Advanced LLM-native memory orchestration with dual-agent synthesis, conflict-aware evolution, multi-tenant vector DB backends, and async SDK/API tooling.

OmniMemory · The Living Brain for Autonomous Agents

Python 3.10+ License: MIT Coverage

Don't just store data. Synthesize memories. OmniMemory transforms static embeddings into a self-evolving cognitive substrate.

Quick Start · CLI · SDK · Production · Environment Variables · Architecture · REST API


Why OmniMemory?

Traditional RAG is a filing cabinet: you put documents in, you take documents out. OmniMemory is a living brain.

It doesn't just "store" messages. It employs a Dual-Agent Synthesis engine to interpret conversations, extract behavioral patterns, and resolve contradictions automatically. When a new memory conflicts with an old one, OmniMemory doesn't just append—it updates, deletes, or consolidates the knowledge graph, just like human memory.

| Feature | Traditional Vector RAG | OmniMemory (SECMSA) |
| --- | --- | --- |
| Input Handling | Naive chunking & embedding | Dual-Agent Synthesis (Episodic + Summarizer) |
| Conflict Resolution | None (contradictions coexist) | Self-Evolving (Update/Delete/Skip operations) |
| Retrieval Logic | Cosine similarity only | Composite Scoring (Relevance × [1 + Recency + Importance]) |
| Context Awareness | Static text chunks | Structured Memory Notes (Behavior, Learnings, Guidance) |
| Multi-Tenancy | Often manual filtering | Native Isolation (App / User / Session tiers) |

Core Features

1. Dual-Agent Synthesis

Two specialized agents work in parallel to process every interaction:

  • Episodic Agent: Analyzes behavior. "User prefers concise answers," "User struggles with async concepts."
  • Summarizer Agent: Analyzes narrative. "Project X is delayed," "Deployed v2.0 to prod."

2. Self-Evolving Memory

Memories aren't static. The system automatically detects conflicts between new and existing information.

  • UPDATE: Merges fragmented details into a single, comprehensive note.
  • DELETE: Removes outdated or contradicted information.
  • SKIP: Ignores redundant inputs to keep the index clean.
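
The three operations can be pictured with a toy in-memory store. This is only a sketch: the `Operation` enum, `apply_operation`, and the string-joining "merge" are hypothetical stand-ins, and the source does not describe how OmniMemory decides which operation to apply or how it consolidates notes.

```python
from enum import Enum


class Operation(Enum):
    UPDATE = "update"
    DELETE = "delete"
    SKIP = "skip"


def apply_operation(store: dict[str, str], op: Operation, memory_id: str, new_note: str = "") -> None:
    """Apply one evolution decision to a toy {memory_id: note} store."""
    if op is Operation.UPDATE:
        # Stand-in for consolidation: join the old and new details into one note.
        existing = store.get(memory_id, "")
        store[memory_id] = f"{existing} {new_note}".strip()
    elif op is Operation.DELETE:
        # The existing note is outdated or contradicted, so drop it.
        store.pop(memory_id, None)
    elif op is Operation.SKIP:
        # The incoming note is redundant; leave the store untouched.
        pass


store = {"m1": "User prefers dark mode.", "m2": "Project X ships in March."}
apply_operation(store, Operation.UPDATE, "m1", "User also prefers a compact layout.")
apply_operation(store, Operation.DELETE, "m2")
apply_operation(store, Operation.SKIP, "m1", "User prefers dark mode.")
print(store)
```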

3. Composite Scoring

We don't just return the "nearest neighbor." We return the most useful memory.

Score = Relevance * (1 + Recency_Boost + Importance_Boost)

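Because the boosts multiply relevance instead of adding to it, a memory with little relevance cannot surface on recency or importance alone, while recent and critical memories get the nudge they need to outrank close competitors.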
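
As a rough illustration of the formula above, the sketch below scores three candidate memories. The function name and the boost values are assumptions chosen for the example; the source does not state how OmniMemory derives or bounds the recency and importance boosts.

```python
def composite_score(relevance: float, recency_boost: float, importance_boost: float) -> float:
    """Score = Relevance * (1 + Recency_Boost + Importance_Boost)."""
    return relevance * (1.0 + recency_boost + importance_boost)


# Hypothetical candidates: (label, relevance, recency_boost, importance_boost)
candidates = [
    ("older, highly relevant", 0.55, 0.00, 0.05),
    ("recent, moderately relevant", 0.45, 0.20, 0.15),
    ("recent but irrelevant", 0.05, 0.20, 0.15),
]

# Rank by composite score, highest first.
ranked = sorted(candidates, key=lambda c: composite_score(*c[1:]), reverse=True)
for label, rel, rec, imp in ranked:
    print(f"{label}: {composite_score(rel, rec, imp):.4f}")
```

With these made-up boosts, the recent memory (0.45 × 1.35) edges past the slightly more relevant older one (0.55 × 1.05), while the irrelevant memory stays last because its boosts scale a near-zero relevance.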

4. Enterprise Multi-Tenancy

Built for SaaS from day one.

  • App Level: Physical isolation (separate collections).
  • User Level: Logical isolation (metadata filtering).
  • Session Level: Conversation grouping.

Supported Backends

Switch providers by changing OMNI_MEMORY_PROVIDER. No code changes required.

| Provider | Env Value | Best For |
| --- | --- | --- |
| Qdrant | qdrant-remote | Production default. High performance, rich filtering |
| ChromaDB | chromadb-remote | Simple deployments, local development |
| PostgreSQL | postgresql | Teams already using Postgres (via pgvector) |
| MongoDB | mongodb | Atlas users needing vector search + document store |

When to Use: API vs SDK

Use the REST API Server (Recommended for Production)

Why: Language-agnostic. Works with any programming language (Node.js, Go, Rust, Java, PHP, etc.)

Best For:

  • ✅ Production deployments
  • ✅ Multi-language teams
  • ✅ Microservices architectures
  • ✅ Need built-in metrics, health checks, connection pooling

Use the Python SDK (Dev/Prototyping)

Why: Direct Python integration for rapid testing

Best For:

  • ✅ Python-only agents
  • ✅ Local development and testing
  • ✅ Prototyping memory operations

Quick Start

TL;DR: Want to see it in action immediately?

# Run the complete Customer Support Agent example
python examples/complete_sdk_example.py

1. Install

uv add omnimemory
# or
pip install omnimemory

2. Configure

Create a .env file (templates in examples/env/):

# LLM & Embeddings
LLM_API_KEY=sk-...
LLM_PROVIDER=openai
EMBEDDING_API_KEY=sk-...
EMBEDDING_PROVIDER=openai

# Vector DB (Choose one: qdrant-remote, chromadb-remote, postgresql, mongodb)
OMNI_MEMORY_PROVIDER=qdrant-remote
QDRANT_HOST=localhost
QDRANT_PORT=6333

3. Run (Choose Your Backend)

Start the vector DB and API server:

Qdrant (Production Default - High Performance)

docker compose -f docker-compose.local.yml --profile qdrant up -d
uv run uvicorn omnimemory.api.server:app --host 0.0.0.0 --port 8001 --reload

ChromaDB (Simple Deployments)

docker compose -f docker-compose.local.yml --profile chromadb up -d
uv run uvicorn omnimemory.api.server:app --host 0.0.0.0 --port 8001 --reload

PostgreSQL (Existing Postgres Users)

docker compose -f docker-compose.local.yml --profile pgvector up -d
uv run uvicorn omnimemory.api.server:app --host 0.0.0.0 --port 8001 --reload

MongoDB (Configure MongoDB Atlas separately)

# Set MONGO_URI in .env first
uv run uvicorn omnimemory.api.server:app --host 0.0.0.0 --port 8001 --reload

4. Use (Python SDK)

from omnimemory.sdk import OmniMemorySDK
from omnimemory.core.schemas import UserMessages, Message
import asyncio

async def main():
    sdk = OmniMemorySDK()
    
    # CRITICAL: Initialize connection pools
    if not await sdk.warm_up():
        print("Failed to warm up SDK")
        return

    # Add a memory (returns a background task ID)
    response = await sdk.add_memory(UserMessages(
        app_id="my-app",
        user_id="user-123",
        messages=[
            Message(role="user", content="I'm building a Python web scraper."),
            Message(role="assistant", content="I can help with libraries like BeautifulSoup.")
        ] * 5 # Need sufficient context (default 10 messages)
    ))
    
    task_id = response["task_id"]
    print(f"Memory processing started. Task ID: {task_id}")
    
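    # Fire-and-forget: memory is processed in the background, so there is no task to poll.
    # Check the logs if you need to debug processing.
    await asyncio.sleep(3)  # Demo only: give the background task time to finish

    # Query memory (semantic search)
    results = await sdk.query_memory(
        app_id="my-app",
        user_id="user-123",
        # session_id is optional; omitted here because the memory above was stored without one
        n_results=1, # Optional
        similarity_threshold=0.3, # Optional; typical good matches score 0.35-0.55
        query="What is the user working on?"
    )

    if results:
        print(results[0]['memory_note'])
        # Example output: "User is developing a Python-based web scraper..."
    else:
        print("No memories matched; background processing may still be running.")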

asyncio.run(main())

5. Quick Test Examples

Want to see it in action? We provide complete, real-world examples for both SDK and API usage.

Run the SDK Example:

# Demonstrates full customer support workflow with memory batching
python examples/complete_sdk_example.py

Run the API Example:

# Requires running server (uv run uvicorn omnimemory.api.server:app --port 8001)
python examples/complete_api_example.py

Production Features

Fully Asynchronous: Write calls return a task ID immediately, so the latency your users see does not depend on how long synthesis takes. Memory synthesis runs in fire-and-forget background tasks. No polling is needed; check the logs when debugging.

Connection Pooling: Intelligent pool management with configurable size (default 10). Initializes with 50% of max connections for optimal startup performance, then scales on-demand to handle concurrent workloads.

Metrics & Observability: Prometheus-compatible metrics at http://localhost:9001/metrics (enable with OMNIMEMORY_ENABLE_METRICS_SERVER=true).

Multi-Tenancy: 3-tier isolation (app/user/session) for SaaS deployments. Complete data separation.

89.58% Test Coverage: Production-grade reliability with comprehensive test suite.

Language Agnostic: REST API works with any language. Python SDK provided for convenience.


Agent Memory SDK

For Agents That Need to Answer Questions Using Stored Memories

The AgentMemorySDK provides a complete "query memory + generate answer" loop. It retrieves relevant memories and calls your LLM with context to generate grounded responses.

import asyncio

from omnimemory import AgentMemorySDK


async def main():
    agent_sdk = AgentMemorySDK()

    response = await agent_sdk.answer_query(
        app_id="my-app-id-1234",
        query="What does the user prefer?",
        user_id="user-123456", # Optional
        session_id="session-123456", # Optional
        n_results=5, # Optional
        similarity_threshold=0.7 # Optional
    )

    print(response["answer"])  # LLM-generated answer
    print(f"Based on {len(response['memories'])} memories")


# answer_query is a coroutine, so it must run inside an event loop
asyncio.run(main())

Use When: Your agent needs to answer user questions using stored memories.

Not For: Storing new memories (use OmniMemorySDK.add_memory or add_agent_memory for that).


CLI Tool

OmniMemory includes a powerful command-line interface for quick operations and testing:

# Install
uv add omnimemory

# Get help
omnimemory --help

# Start daemon for background operations
omnimemory daemon start

# Add memory
omnimemory memory add \
  --app-id "myapp-1234567890" \
  --user-id "user-1234567890" \
  --message "user:I prefer dark mode" \
  --message "assistant:Noted, I'll remember that"

# Query memory
omnimemory memory query \
  --app-id "myapp-1234567890" \
  --query "user preferences"

# Check system health
omnimemory health

# View comprehensive feature guide
omnimemory info

# Daemon management
omnimemory daemon status
omnimemory daemon stop

Available Commands:

  • omnimemory memory - Memory operations (add, query, get, delete)
  • omnimemory memory batch - Batch message operations
  • omnimemory daemon - Background daemon management
  • omnimemory agent - Agent-specific operations
  • omnimemory health - System health diagnostics
  • omnimemory info - Feature overview

For detailed CLI documentation, run omnimemory --help.


SDK Usage Guide

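Note: This guide covers the Python SDK. For the HTTP REST API, see API_SPECIFICATION.md. The snippets in this guide use await, so run them inside an async function (as in the Quick Start) with an sdk instance that has already been warmed up.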

Initialization

CRITICAL: You must warm up the connection pools before making requests.

Why: Initializes vector DB connections for low latency on first request.

from omnimemory.sdk import OmniMemorySDK

sdk = OmniMemorySDK()
success = await sdk.warm_up()
if not success:
    print("Failed to initialize connections")

Core Memory Operations

1. Add Memory (add_memory)

Use Case: Primary engine for conversation analysis with Dual-Agent Synthesis.

Why: Needs a flow of conversation (default 10 messages) to understand context. The Episodic and Summarizer agents extract behavioral patterns and resolve conflicts. Single messages won't work.

Parameters:

  • user_message: UserMessages - Contains app_id, user_id, session_id (optional), messages (list of Message objects)
  • Constraint: messages must contain exactly OMNIMEMORY_DEFAULT_MAX_MESSAGES entries (default 10)

Returns: Task ID immediately (async processing)

from omnimemory.core.schemas import UserMessages, Message

response = await sdk.add_memory(UserMessages(
    app_id="my-app-id-1234",
    user_id="user-123456",
    session_id="session-789",  # Optional
    messages=[
        Message(role="user", content="I prefer dark mode"),
        Message(role="assistant", content="Noted, I'll remember that")
        # ... total 10 messages required
    ]
))
task_id = response["task_id"]
print(f"Processing in background: {task_id}")
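
Because add_memory expects a full window of messages, a longer transcript has to be cut into windows of exactly that size first. The helper below is a minimal sketch using plain dicts; `full_windows` is a hypothetical name, not part of the SDK, and what to do with the leftover messages (hold them until more arrive, or feed them to the batcher described later) is left to the caller.

```python
def full_windows(
    messages: list[dict[str, str]], size: int = 10
) -> tuple[list[list[dict[str, str]]], list[dict[str, str]]]:
    """Split messages into consecutive windows of exactly `size`.

    Returns (windows, leftover), where leftover holds the trailing
    messages that do not yet fill a complete window.
    """
    cutoff = len(messages) - len(messages) % size
    windows = [messages[i:i + size] for i in range(0, cutoff, size)]
    return windows, messages[cutoff:]


# 23 alternating user/assistant messages -> two full windows and 3 left over.
history = [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"message {i}"}
    for i in range(23)
]
windows, leftover = full_windows(history)
print(len(windows), len(leftover))
```

Each window could then be converted to Message objects and passed to add_memory in its own UserMessages request.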

2. Add Agent Memory (add_agent_memory)

Use Case: Agent Tool for quick saves.

Why: When your agent learns new info or user says "save this," the agent calls this directly. Accepts both structured and unstructured messages. Bypasses conflict resolution for speed.

Best Practice: Add to agent system prompt as a tool.

Parameters:

  • agent_request: AgentMemoryRequest - Contains app_id, user_id, session_id (optional), messages (string or list)

Returns: Task ID immediately

from omnimemory.core.schemas import AgentMemoryRequest

# Unstructured (string)
response = await sdk.add_agent_memory(AgentMemoryRequest(
    app_id="my-app-id-1234",
    user_id="user-123456",
    messages="User completed premium signup and selected annual plan"
))

# Structured (list)
response = await sdk.add_agent_memory(AgentMemoryRequest(
    app_id="my-app-id-1234",
    user_id="user-123456",
    messages=[
        {"role": "user", "content": "What's my email?"},
        {"role": "assistant", "content": "It's user@example.com"}
    ]
))

3. Query Memory (query_memory)

Use Case: Retrieve memories using semantic search and composite scoring.

How: Uses Relevance × (1 + Recency + Importance) scoring.

Parameters:

  • app_id: str (required)
  • query: str (required) - Natural language query
  • user_id: str (optional) - Filter by user
  • session_id: str (optional) - Filter by session
  • n_results: int (optional, default from env) - Max results to return
  • similarity_threshold: float (optional, default from env) - Min similarity (0.0-1.0). Overrides OMNIMEMORY_RECALL_THRESHOLD env var.

Returns: List of memory dictionaries

Query Best Practices:

💡 TIP: Specific queries yield better results than generic ones.

| Query Type | Example | Expected Score | Quality |
| --- | --- | --- | --- |
| Too Generic | "what is machine learning" | 0.20-0.30 | Poor - too broad |
| Somewhat Generic | "neural networks" | 0.30-0.40 | Fair - lacks context |
| Specific | "how to implement neural networks with backpropagation" | 0.40-0.55 | Good - targeted |
| Very Specific | "troubleshooting slow loss decrease in neural network training" | 0.50-0.65 | Excellent - precise |

Understanding Similarity Scores:

  • 0.20-0.35: Weakly related content (consider lowering threshold if needed)
  • 0.35-0.55: Semantically related (typical for good matches)
  • 0.55-0.75: Strong match (rare, usually requires similar phrasing)
  • 0.75-1.00: Near-identical content (very rare in practice)

Note: Composite scoring boosts relevant memories with recency/importance, so a 0.45 similarity can become 0.60+ composite score.

results = await sdk.query_memory(
    app_id="my-app-id-1234",
    query="What does the user like?",
    user_id="user-123456",          # Optional
    session_id="session-789",       # Optional
    n_results=10,                   # Optional (default 5)
    similarity_threshold=0.75       # Optional (overrides env default 0.3)
)

for memory in results:
    print(memory["document"])
    print(f"Score: {memory['composite_score']}")

4. Get Memory (get_memory)

Use Case: Retrieve a single memory by its ID.

Why: When you have a memory ID from a previous operation and need full content.

Parameters:

  • memory_id: str (required)
  • app_id: str (required)

Returns: Memory dict or None

memory = await sdk.get_memory(
    memory_id="uuid-1234-5678",
    app_id="my-app-id-1234"
)
if memory:
    print(memory["document"])

5. Delete Memory (delete_memory)

Use Case: Manual memory deletion (GDPR, cleanup).

Why: User requests deletion or you need to remove test data.

Parameters:

  • app_id: str (required)
  • doc_id: str (required) - Document ID to delete

Returns: Boolean (success/failure)

success = await sdk.delete_memory(
    app_id="my-app-id-1234",
    doc_id="uuid-1234-5678"
)
if success:
    print("Memory deleted")

Summarization

Summarize Conversation (summarize_conversation)

Use Case: Context Window Management.

Why: When working memory is full, generate a summary, save it, delete old messages to free tokens.

Accepts: Both structured and unstructured messages.

Two Modes:

Sync Mode (No callback_url)

  • Returns: Summary immediately
  • Processing: Fast (use_fast_path=True)
  • Use When: Real-time responses needed, short contexts

from omnimemory.core.schemas import ConversationSummaryRequest

summary = await sdk.summarize_conversation(ConversationSummaryRequest(
    app_id="my-app-id-1234",
    user_id="user-123456",
    messages=[...]  # Structured or unstructured
))
print(summary["content"])
print(summary["delivery"])  # "sync"

Webhook Mode (With callback_url)

  • Returns: Task ID immediately
  • Processing: Full structured summary (use_fast_path=False)
  • Delivery: POSTs result to your webhook URL
  • Retry: 3 attempts with exponential backoff
  • Use When: Long conversations, background processing, need auto-replacement

Parameters:

  • summary_request: ConversationSummaryRequest
    • app_id: str (required)
    • user_id: str (required)
    • session_id: str (optional)
    • messages: str | list (required)
    • callback_url: str (optional) - If provided, enables webhook mode
    • callback_headers: dict (optional) - Custom headers for webhook (e.g., auth)

response = await sdk.summarize_conversation(ConversationSummaryRequest(
    app_id="my-app-id-1234",
    user_id="user-123456",
    messages=[...],  # Long conversation
    callback_url="https://api.myapp.com/webhooks/summary",
    callback_headers={"Authorization": "Bearer token123"}
))
print(response["task_id"])
print(response["status"])  # "accepted"
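
To reason about what your webhook endpoint will see, it can help to picture the sender's retry loop. The sketch below shows a generic "3 attempts with exponential backoff" loop; the 1-second base delay, the doubling factor, and the `deliver_with_retry` helper are assumptions for illustration, since the source does not specify OmniMemory's actual delays.

```python
import time
from typing import Callable


def deliver_with_retry(
    send: Callable[[], bool],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `send` up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        if send():
            return True
        if attempt < attempts - 1:
            # Wait base_delay, then 2 * base_delay, ... before the next try.
            sleep(base_delay * 2 ** attempt)
    return False


# Fake webhook that fails twice, then succeeds; delays are recorded instead of slept.
outcomes = iter([False, False, True])
delays: list[float] = []
delivered = deliver_with_retry(lambda: next(outcomes), sleep=delays.append)
print(delivered, delays)
```

Since a retried delivery can arrive after an earlier attempt partially succeeded, it is usually worth making the webhook handler idempotent, for example by keying on the task ID.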

Batching

Memory Batcher (memory_batcher_add_message)

Use Case: Streaming chat loops.

Why: Automatically buffers messages and calls add_memory when limit is reached. No manual counting.

How: Non-blocking. Monitors message count per (app_id, user_id, session_id) tuple. When it hits OMNIMEMORY_DEFAULT_MAX_MESSAGES (default 10), auto-flushes.

Parameters:

  • app_id: str (required)
  • user_id: str (required)
  • session_id: str (optional)
  • role: str (required) - "user", "assistant", "system"
  • content: str (required)

# In your chat loop
for message in stream:
    await sdk.memory_batcher_add_message(
        app_id="my-app-id-1234",
        user_id="user-123456",
        role=message.role,
        content=message.content
    )
    # SDK handles auto-flush at 10 messages
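
The buffering behaviour described above can be sketched with a small synchronous class. `ToyBatcher` is a hypothetical illustration, not the SDK's implementation: the real batcher is asynchronous and calls add_memory on flush, whereas this version simply hands the batch to a callback.

```python
from collections import defaultdict
from typing import Callable, Optional

MAX_MESSAGES = 10  # mirrors the OMNIMEMORY_DEFAULT_MAX_MESSAGES default

Key = tuple[str, str, Optional[str]]


class ToyBatcher:
    """Buffers messages per (app_id, user_id, session_id) and flushes at the limit."""

    def __init__(self, flush: Callable[[Key, list[dict[str, str]]], None], limit: int = MAX_MESSAGES):
        self._flush = flush
        self._limit = limit
        self._buffers: dict[Key, list[dict[str, str]]] = defaultdict(list)

    def add_message(self, app_id: str, user_id: str, role: str, content: str,
                    session_id: Optional[str] = None) -> None:
        key = (app_id, user_id, session_id)
        self._buffers[key].append({"role": role, "content": content})
        if len(self._buffers[key]) >= self._limit:
            # Remove the full buffer before flushing so new messages start a fresh batch.
            batch = self._buffers.pop(key)
            self._flush(key, batch)


flushed: list[tuple[Key, int]] = []
batcher = ToyBatcher(flush=lambda key, batch: flushed.append((key, len(batch))))
for i in range(12):
    batcher.add_message("my-app", "user-1", "user", f"message {i}")
print(flushed)
```

After twelve messages the callback has fired once with ten messages, and the remaining two wait in a new buffer for the same key.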

Evolution & Auditing

1. Traverse Evolution Chain (traverse_memory_evolution_chain)

Use Case: See how a memory evolved over time.

Why: Memories update, delete, merge. This traces the full history.

How: Follows next_id pointers from original to final memory.

Parameters:

  • app_id: str (required)
  • memory_id: str (required) - Starting memory ID

Returns: List of memories in chronological order

chain = await sdk.traverse_memory_evolution_chain(
    app_id="my-app-id-1234",
    memory_id="original-uuid-1234"
)
print(f"Chain contains {len(chain)} memory versions")
for memory in chain:
    print(f"{memory['metadata']['status']} - {memory['document'][:50]}")
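
Following next_id pointers is a linked-list walk. The sketch below runs it over a toy dict of memories; the `follow_chain` helper and the placement of `id` and `next_id` as top-level keys are assumptions for illustration (the SDK may store the pointer inside the metadata).

```python
from typing import Optional


def follow_chain(memories: dict[str, dict], start_id: str) -> list[dict]:
    """Walk next_id pointers from start_id until the chain ends."""
    chain: list[dict] = []
    seen: set[str] = set()
    current: Optional[str] = start_id
    # Stop on a missing pointer, an unknown ID, or a cycle.
    while current is not None and current in memories and current not in seen:
        seen.add(current)
        chain.append(memories[current])
        current = memories[current].get("next_id")
    return chain


memories = {
    "a": {"id": "a", "document": "User likes tabs", "next_id": "b"},
    "b": {"id": "b", "document": "User likes spaces", "next_id": "c"},
    "c": {"id": "c", "document": "User likes spaces, 4 wide", "next_id": None},
}
print([memory["id"] for memory in follow_chain(memories, "a")])
```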

2. Generate Evolution Graph (generate_evolution_graph)

Use Case: Visualize evolution chain.

Formats: mermaid, dot, html

Parameters:

  • chain: List[Dict] (required) - Output from traverse_memory_evolution_chain
  • format: str (required) - "mermaid", "dot", or "html"

chain = await sdk.traverse_memory_evolution_chain(...)

# Mermaid (for docs)
mermaid = sdk.generate_evolution_graph(chain, format="mermaid")
print(mermaid)

# HTML (for browser visualization)
html = sdk.generate_evolution_graph(chain, format="html")
with open("evolution.html", "w") as f:
    f.write(html)
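
For a sense of what a Mermaid rendering of a chain might look like, the sketch below builds a simple flowchart by hand. `chain_to_mermaid` and the `id` key are hypothetical; the SDK's actual output format is not shown in this document and may differ.

```python
def chain_to_mermaid(chain: list[dict]) -> str:
    """Render a chain as a top-down Mermaid flowchart, oldest memory first."""
    lines = ["graph TD"]
    for memory in chain:
        # Double quotes would end the Mermaid label early, so swap them out.
        label = memory["document"].replace('"', "'")
        lines.append(f'    {memory["id"]}["{label}"]')
    # Connect each memory to the version that replaced it.
    for older, newer in zip(chain, chain[1:]):
        lines.append(f'    {older["id"]} --> {newer["id"]}')
    return "\n".join(lines)


toy_chain = [
    {"id": "m1", "document": "User likes tabs"},
    {"id": "m2", "document": "User likes spaces"},
]
diagram = chain_to_mermaid(toy_chain)
print(diagram)
```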

3. Generate Evolution Report (generate_evolution_report)

Use Case: Detailed analysis of memory changes.

Formats: markdown, text, json

Parameters:

  • chain: List[Dict] (required)
  • format: str (required) - "markdown", "text", or "json"

report = sdk.generate_evolution_report(chain, format="markdown")
print(report)  # Includes stats, timeline, insights

System & Monitoring

Connection Pool Stats (get_connection_pool_stats)

Use Case: Production monitoring and debugging.

Why: If queries are slow, check if you're hitting connection limits.

Returns: Dict with pool metrics

stats = await sdk.get_connection_pool_stats()
print(f"Active: {stats['active_handlers']}/{stats['max_connections']}")
print(f"Available: {stats['available_handlers']}")
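
A small helper can turn these stats into a saturation check for alerting. The key names come from the snippet above; the `pool_utilization` and `is_saturated` helpers and the 80% threshold are assumptions, so pick a threshold that suits your workload.

```python
def pool_utilization(stats: dict[str, int]) -> float:
    """Fraction of the pool currently in use (0.0 when the pool has no capacity)."""
    max_connections = stats["max_connections"]
    if max_connections <= 0:
        return 0.0
    return stats["active_handlers"] / max_connections


def is_saturated(stats: dict[str, int], threshold: float = 0.8) -> bool:
    """True when usage crosses the threshold or no handlers are available."""
    return pool_utilization(stats) >= threshold or stats["available_handlers"] == 0


sample = {"active_handlers": 9, "available_handlers": 1, "max_connections": 10}
print(f"{pool_utilization(sample):.0%} used, saturated={is_saturated(sample)}")
```

If the check trips regularly, raising OMNIMEMORY_VECTOR_DB_MAX_CONNECTIONS is the tuning knob this README points to.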

Configuration & Tuning

Tune these hyperparameters in your .env file to optimize for your specific use case.

| Parameter | Default | Description | Tuning Guide |
| --- | --- | --- | --- |
| OMNIMEMORY_RECALL_THRESHOLD | 0.3 | Minimum cosine similarity for initial retrieval from Vector DB. | Lower to 0.2-0.25 for broader recall. Note: Typical good matches score 0.35-0.55, not 0.7+. Specific queries perform better than generic ones. |
| OMNIMEMORY_COMPOSITE_SCORE_THRESHOLD | 0.5 | Minimum final score (Relevance × Boosts) to return a memory. | Lower to 0.35-0.4 for more results. Composite scoring boosts base similarity with recency/importance, so a 0.45 similarity can become 0.60+ composite score. |
| OMNIMEMORY_LINK_THRESHOLD | 0.7 | Similarity required to "link" memories for conflict resolution. | Lower to 0.6 to trigger evolution/updates more often. Raise to 0.8 to reduce "noise" and only link very similar topics. |
| OMNIMEMORY_DEFAULT_MAX_MESSAGES | 10 | Number of messages required for add_memory. | Match this to your LLM's context window preference. Too low = poor synthesis; Too high = context bloat. |
| OMNIMEMORY_VECTOR_DB_MAX_CONNECTIONS | 10 | Max concurrent DB connections. Pool initializes with 50% of this value, then scales on-demand. | Reduce to 3-5 for low-resource environments (e.g., local dev). Increase to 20-30 for high-throughput production. Initial pool will be half of this value. |
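
The two thresholds act as successive gates: candidates must first clear the recall threshold on raw similarity, then clear the composite threshold after boosting. The sketch below illustrates that flow with the default values; the `select` function, the candidate dicts, and the use of `>=` for both comparisons are assumptions, not the library's code.

```python
RECALL_THRESHOLD = 0.3            # OMNIMEMORY_RECALL_THRESHOLD default
COMPOSITE_SCORE_THRESHOLD = 0.5   # OMNIMEMORY_COMPOSITE_SCORE_THRESHOLD default


def select(candidates: list[dict]) -> list[dict]:
    """Apply the recall gate, compute composite scores, then apply the composite gate."""
    recalled = [c for c in candidates if c["similarity"] >= RECALL_THRESHOLD]
    scored = [
        {**c, "composite": c["similarity"] * (1 + c["recency"] + c["importance"])}
        for c in recalled
    ]
    kept = [c for c in scored if c["composite"] >= COMPOSITE_SCORE_THRESHOLD]
    return sorted(kept, key=lambda c: c["composite"], reverse=True)


candidates = [
    {"id": "m1", "similarity": 0.45, "recency": 0.20, "importance": 0.15},  # 0.6075, kept
    {"id": "m2", "similarity": 0.35, "recency": 0.05, "importance": 0.05},  # 0.385, below composite gate
    {"id": "m3", "similarity": 0.25, "recency": 0.50, "importance": 0.50},  # dropped at recall gate
]
print([c["id"] for c in select(candidates)])
```

Note that m3 never reaches the scoring step: large boosts cannot rescue a memory that fails the recall gate, which is why lowering OMNIMEMORY_RECALL_THRESHOLD is the lever for broader recall.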

Environment Variables

Required Variables

| Variable | Description | Example |
| --- | --- | --- |
| LLM_API_KEY | LLM provider API key | sk-... |
| LLM_PROVIDER | LLM provider name | openai, anthropic, mistral |
| EMBEDDING_API_KEY | Embedding provider API key | sk-... |
| EMBEDDING_PROVIDER | Embedding provider | openai |
| OMNI_MEMORY_PROVIDER | Vector DB backend | qdrant-remote, chromadb-remote, postgresql, mongodb |

LLM Configuration

| Variable | Default | Description |
| --- | --- | --- |
| LLM_MODEL | - | Model name (e.g., gpt-4, claude-3-opus) |
| LLM_TEMPERATURE | 0.4 | Creativity (0.0-2.0) |
| LLM_MAX_TOKENS | 3000 | Max response tokens |
| LLM_TOP_P | 0.9 | Nucleus sampling |

Embedding Configuration

| Variable | Default | Description |
| --- | --- | --- |
| EMBEDDING_MODEL | - | Embedding model name |
| EMBEDDING_DIMENSIONS | - | Vector dimensions |
| EMBEDDING_ENCODING_FORMAT | base64 | Response encoding format |
| EMBEDDING_TIMEOUT | 600 | Request timeout (seconds) |

Vector Database

Qdrant:

  • QDRANT_HOST - Qdrant server host
  • QDRANT_PORT - Qdrant port (default 6333)

ChromaDB:

  • CHROMA_HOST - ChromaDB server host
  • CHROMA_PORT - ChromaDB port (default 8000)
  • CHROMA_AUTH_TOKEN - Authentication token
  • CHROMA_CLIENT_TYPE - Client type (remote for server)

PostgreSQL:

  • POSTGRES_URI - Full connection string (e.g., postgresql://user:pass@host:5432/db)

MongoDB:

  • MONGO_URI - MongoDB Atlas connection string
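
A startup check can catch missing configuration before the first request. The sketch below combines the required variables with the backend-specific ones listed above; `config_problems` is a hypothetical helper, and treating only the host or URI of each backend as mandatory is an assumption (ports have documented defaults for Qdrant and ChromaDB).

```python
REQUIRED = [
    "LLM_API_KEY",
    "LLM_PROVIDER",
    "EMBEDDING_API_KEY",
    "EMBEDDING_PROVIDER",
    "OMNI_MEMORY_PROVIDER",
]

# Assumed minimum per backend; adjust to match your deployment.
BACKEND_VARS = {
    "qdrant-remote": ["QDRANT_HOST"],
    "chromadb-remote": ["CHROMA_HOST"],
    "postgresql": ["POSTGRES_URI"],
    "mongodb": ["MONGO_URI"],
}


def config_problems(env: dict[str, str]) -> list[str]:
    """Return human-readable problems found in an environment mapping."""
    problems = [f"missing {name}" for name in REQUIRED if not env.get(name)]
    provider = env.get("OMNI_MEMORY_PROVIDER")
    if provider and provider not in BACKEND_VARS:
        problems.append(f"unknown OMNI_MEMORY_PROVIDER: {provider}")
    for name in BACKEND_VARS.get(provider, []):
        if not env.get(name):
            problems.append(f"missing {name} (needed for {provider})")
    return problems


sample_env = {
    "LLM_API_KEY": "sk-...",
    "LLM_PROVIDER": "openai",
    "EMBEDDING_API_KEY": "sk-...",
    "EMBEDDING_PROVIDER": "openai",
    "OMNI_MEMORY_PROVIDER": "postgresql",
}
print(config_problems(sample_env))
```

In a real service you would pass `dict(os.environ)` instead of the sample mapping and refuse to start when the list is non-empty.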

Observability

| Variable | Default | Description |
| --- | --- | --- |
| OMNIMEMORY_ENABLE_METRICS_SERVER | false | Enable Prometheus metrics endpoint |
| OMNIMEMORY_METRICS_PORT | 9001 | Metrics HTTP server port |
| LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR) |
| LOG_DIR | ./logs | Log file directory path |

Production Deployment

Step 1: Prepare Environment Variables

Create a .env file with all required configuration:

# LLM Configuration (Required)
LLM_API_KEY=your-api-key-here
LLM_PROVIDER=openai
LLM_MODEL=gpt-4
LLM_TEMPERATURE=0.4
LLM_MAX_TOKENS=3000

# Embedding Configuration (Required)
EMBEDDING_API_KEY=your-api-key-here
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_DIMENSIONS=1536

# Vector Database (Choose one)
OMNI_MEMORY_PROVIDER=qdrant-remote
QDRANT_HOST=your-qdrant-host.com
QDRANT_PORT=6333

# OmniMemory Hyperparameters (Optional - tune for your use case)
OMNIMEMORY_DEFAULT_MAX_MESSAGES=10
OMNIMEMORY_RECALL_THRESHOLD=0.3
OMNIMEMORY_COMPOSITE_SCORE_THRESHOLD=0.4
OMNIMEMORY_LINK_THRESHOLD=0.8
OMNIMEMORY_VECTOR_DB_MAX_CONNECTIONS=10

# Metrics & Observability (Optional)
OMNIMEMORY_ENABLE_METRICS_SERVER=true
OMNIMEMORY_METRICS_PORT=9001
LOG_LEVEL=INFO

Step 2: Deploy with Docker Compose

# Start with your chosen backend
docker compose -f docker-compose.local.yml --profile qdrant up -d

Step 3: Production Hardening

⚠️ CRITICAL SECURITY WARNING: The provided docker-compose.local.yml is designed for local development only. For production deployments, you MUST implement the following security measures:

1. Enable HTTPS

Add a reverse proxy (nginx, Traefik, or Caddy) with SSL certificates:

# Add to your docker-compose
services:
  nginx:
    image: nginx:alpine
    ports:
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - /etc/letsencrypt:/etc/letsencrypt:ro
    depends_on:
      - api-qdrant

2. Implement Authentication

  • Use API keys in request headers
  • Implement OAuth 2.0 or JWT tokens
  • Configure authentication middleware in nginx/Traefik

3. Use Secrets Management

# Don't use .env files in production
# Use Docker secrets or cloud provider secrets management
# Note: docker secret requires Swarm mode (run docker swarm init first)
docker secret create llm_api_key ./llm_key.txt
docker secret create embedding_api_key ./embedding_key.txt

4. Network Security

# Configure firewall to only expose necessary ports:
# - 443 (HTTPS) - public facing
# - 6333 (Qdrant) - internal network only
# - 9001 (Metrics) - internal network only

# Example: UFW firewall rules
# If you administer the host over SSH, allow it before enabling (e.g. sudo ufw allow 22/tcp)
sudo ufw allow 443/tcp
sudo ufw enable

5. Enable Monitoring

# Start with monitoring profile for Prometheus + Grafana
docker compose -f docker-compose.local.yml \
  --profile qdrant \
  --profile monitoring up -d

# Access Grafana at http://localhost:3000 (default: admin/admin)
# Access Prometheus at http://localhost:9090
# Configure alerts and dashboards for production monitoring

6. Backup & Disaster Recovery

  • Configure automated backups for your vector database
  • Test recovery procedures regularly
  • Use persistent volumes for data

Development & Testing

Running Tests

# Install development dependencies
uv sync

# Run all tests
uv run pytest

# Run with coverage report
uv run pytest --cov=src/omnimemory --cov-report=html

# View coverage report (macOS; use xdg-open on Linux)
open htmlcov/index.html

Current Test Coverage: 89.58%

Running Locally

# Start vector database
docker compose -f docker-compose.local.yml --profile qdrant up -d

# Run API server in development mode
uv run uvicorn omnimemory.api.server:app --host 0.0.0.0 --port 8001 --reload

# Or use the provided script
python run_api_server.py

Architecture

OmniMemory implements the Self-Evolving Composite Memory Synthesis Architecture (SECMSA).

For comprehensive architecture documentation:

  • ARCHITECTURE.md - Deep dive into SECMSA, mathematical foundations, scoring algorithms, conflict resolution, and design decisions
  • C4_ARCHITECTURE.md - Visual system architecture with PlantUML diagrams:
    • Level 1: System Context
    • Level 2: Container Diagram
    • Level 3: Component Diagram
    • Level 4: Code Structure
    • Sequence diagrams for memory creation and retrieval flows

Contributing

We welcome contributions! Please see our Contributing Guide for details.

  1. Clone: git clone https://github.com/omnirexflora-labs/omnimemory
  2. Sync: uv sync --group dev
  3. Test: uv run pytest

License

MIT © OmniRexFlora Labs
