A scalable memory system for AI agents using graph-based sharding and hierarchical clustering.

Lazzaro

Scalable Memory System Library

Lazzaro is a Python library that gives AI agents long-term, scalable, structured memory. Unlike a plain vector database, Lazzaro combines a graph-based store with memory sharding and hierarchical clustering to mimic how human memory works: it holds active context in a buffer, consolidates short-term interactions into long-term structures, and forgets irrelevant details over time.

Installation

pip install lazzaro

How It Works

Lazzaro operates on a few core principles to manage memory scalability and relevance:

1. Architecture

Sharding: Memories are automatically categorized into shards (e.g., work, personal, health) based on content. This allows the system to retrieve only relevant slices of memory, keeping searches fast.
Buffer Graph: Active memories live in a dynamic graph structure where nodes are facts/thoughts and edges are relationships (associations).
Persistence: State is automatically persisted to local disk (db/lazzaro.pkl) using fast binary serialization.
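
The sharding and buffer-graph ideas above can be illustrated with a small, self-contained sketch. This is not Lazzaro's implementation: `MemoryNode`, `ShardedGraph`, and the keyword-based `route` method are hypothetical names, and the keyword lookup merely stands in for whatever content-based categorization the library actually performs.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MemoryNode:
    node_id: int
    text: str
    shard: str
    edges: set[int] = field(default_factory=set)  # ids of associated nodes


class ShardedGraph:
    """Toy graph that files each node under exactly one shard."""

    # Hypothetical keyword routing; a stand-in for content-based categorization.
    KEYWORDS = {"work": {"deadline", "project"}, "health": {"doctor", "sleep"}}

    def __init__(self):
        self.nodes: dict[int, MemoryNode] = {}
        self.shards: dict[str, set[int]] = defaultdict(set)

    def route(self, text: str) -> str:
        words = set(text.lower().split())
        for shard, keys in self.KEYWORDS.items():
            if words & keys:
                return shard
        return "personal"  # fallback shard when no keyword matches

    def add(self, text: str) -> MemoryNode:
        node = MemoryNode(len(self.nodes), text, self.route(text))
        self.nodes[node.node_id] = node
        self.shards[node.shard].add(node.node_id)
        return node

    def link(self, a: int, b: int) -> None:
        # Edges are undirected associations between two memories.
        self.nodes[a].edges.add(b)
        self.nodes[b].edges.add(a)

    def search(self, query: str) -> list[str]:
        # Only the shard matching the query is scanned, not the whole graph.
        shard = self.route(query)
        return [self.nodes[i].text for i in sorted(self.shards[shard])]
```

Because `search` touches a single shard, its cost depends on the size of that shard rather than on the total number of stored memories.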

2. Memory Lifecycle

Short-Term Memory (STM): Every user interaction is initially stored in a temporary list.
Consolidation: When a conversation ends (or periodically), Lazzaro runs a background process to:
- Extract atomic facts from the conversation using an LLM.
- Embed these facts and insert them into the appropriate Shard.
- Link new memories to existing related memories (Graph edges).
Forgetting: A buffer limit caps the number of active nodes. Old, unused, or low-salience memories are "pruned" (archived or deleted) to keep the active graph lightweight.
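
The lifecycle above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: the `Lifecycle` class, the `salience` and `last_used` fields, and the ranking rule are all hypothetical, and the LLM-based fact extraction is replaced by a function the caller supplies.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Memory:
    text: str
    salience: float  # hypothetical importance score; higher is more important
    last_used: int   # hypothetical logical clock; larger is more recent


class Lifecycle:
    def __init__(self, extract: Callable[[list[str]], list[Memory]], max_buffer_size: int = 10):
        self.extract = extract  # stands in for LLM-based fact extraction
        self.max_buffer_size = max_buffer_size
        self.short_term: list[str] = []
        self.buffer: list[Memory] = []
        self.archive: list[Memory] = []

    def observe(self, message: str) -> None:
        # Short-term memory: raw interactions go into a temporary list.
        self.short_term.append(message)

    def consolidate(self) -> None:
        # Consolidation: turn short-term messages into long-term memories.
        self.buffer.extend(self.extract(self.short_term))
        self.short_term.clear()
        # Forgetting: rank by salience, then recency, and archive the overflow.
        self.buffer.sort(key=lambda m: (m.salience, m.last_used), reverse=True)
        self.archive.extend(self.buffer[self.max_buffer_size:])
        del self.buffer[self.max_buffer_size:]
```

Archiving the overflow instead of deleting it is a design choice of this sketch; it keeps pruned memories recoverable while the active buffer stays bounded.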

3. Hierarchy & Super-Nodes

When a shard grows too large, Lazzaro automatically clusters related nodes under a Super-Node. This creates a hierarchical index: retrieval scans high-level topics first and descends only into the matching clusters, so it avoids comparing the query against every node in a large shard.
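
A two-stage lookup of this kind might look like the sketch below. It assumes each Super-Node is represented by the centroid of its members' embeddings and that similarity is cosine similarity; both are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def retrieve(query, clusters, top_k=1):
    """`clusters` maps a super-node label to a list of (text, embedding) pairs."""
    # Stage 1: pick the super-node whose centroid is closest to the query.
    # (A real index would cache centroids instead of recomputing them.)
    best = max(
        clusters,
        key=lambda label: cosine(query, centroid([vec for _, vec in clusters[label]])),
    )
    # Stage 2: rank only the members of that cluster.
    members = sorted(clusters[best], key=lambda item: cosine(query, item[1]), reverse=True)
    return best, [text for text, _ in members[:top_k]]
```

With this layout, a query is compared against one centroid per cluster plus the members of a single cluster, rather than against every node.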

Usage

CLI (Interactive Mode)

The easiest way to use Lazzaro is via the command-line interface.

lazzaro-cli

Common Commands:

/start: Begin a new conversation session.
/end: End the current session and trigger background consolidation.
/stats: View current graph size, cache hit rates, and retrieval latency.
/set <param> <value>: Update configuration (e.g., /set max_buffer_size 50).
/save <filename>: Export current state to a JSON file.

Python API

Integrate Lazzaro into your own applications:

from lazzaro import MemorySystem
import os

# Initialize the system
# It will automatically load previous state from db/lazzaro.pkl if it exists
ms = MemorySystem(
    openai_api_key=os.getenv("OPENAI_API_KEY"),
    enable_async=True,
    auto_consolidate=True
)

# 1. Start a session
ms.start_conversation()

# 2. Chat with memory context
# The system retrieves relevant memories and injects them into the context
# Use chat_stream to get a streaming response iterator
print("Assistant: ", end="", flush=True)
for token in ms.chat_stream("I'm working on the new physics engine today."):
    if token['type'] == 'token':
        print(token['content'], end="", flush=True)
print()

# 3. Add explicit memories (optional)
ms.add_to_short_term("Project deadline is next Friday.", memory_type="fact")

# 4. End session to trigger consolidation
# This extracts facts, updates the graph, and saves to disk
print(ms.end_conversation())

Framework Integration

Using LangChain

Lazzaro allows you to bring your own LLM backend. Here is how to use ChatOpenAI (or any other LangChain chat model) as the reasoning engine for Lazzaro.

from lazzaro import MemorySystem
from lazzaro.core.interfaces import LLMProvider
from langchain_openai import ChatOpenAI
from typing import Dict, List, Optional

class LangChainAdapter(LLMProvider):
    def __init__(self, model_name: str = "gpt-4"):
        self.model = ChatOpenAI(model=model_name, temperature=0.7)

    def completion(self, messages: List[Dict[str, str]], response_format: Optional[Dict] = None) -> str:
        # 1. Lazzaro passes messages as {'role': '...', 'content': '...'} dicts.
        #    For simplicity, this adapter sends only the last message as the prompt;
        #    to keep the full chat history you could build a ChatPromptTemplate instead.
        last_message = messages[-1]['content']

        # 2. Handle JSON enforcement if requested (Lazzaro uses this for extraction)
        if response_format and response_format.get("type") == "json_object":
            # In a real app, use .with_structured_output() or prompt engineering
            last_message += "\nIMPORTANT: Return valid JSON only."

        # 3. Invoke the LangChain model
        response = self.model.invoke(last_message)
        return response.content

    def completion_stream(self, messages: List[Dict[str, str]], response_format: Optional[Dict] = None):
        # Streaming is not implemented in this example; fail loudly instead of
        # silently returning None.
        raise NotImplementedError("Streaming is not implemented in this adapter")

# Initialize Lazzaro with your custom adapter
ms = MemorySystem(
    openai_api_key="...",  # Required for default EmbeddingProvider (unless replaced)
    llm_provider=LangChainAdapter(model_name="gpt-4-turbo"),
    # embedding_provider=MyEmbeddingAdapter()  # Optional: Replace embedder too
)

ms.start_conversation()
print(ms.chat("Hello! I'm using LangChain under the hood."))
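
The adapter above forwards only the last message, which drops the system prompt and earlier turns. One way to keep the full history is to convert Lazzaro's role/content dicts into the `(role, content)` tuples that LangChain chat models accept. The helper below is a sketch: `to_langchain_messages` and `ROLE_MAP` are hypothetical names, and treating unknown roles as user messages is an assumption.

```python
from typing import Dict, List, Tuple

# Role names that LangChain accepts in (role, content) tuples.
ROLE_MAP = {"system": "system", "user": "human", "assistant": "ai"}


def to_langchain_messages(messages: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Convert role/content dicts into (role, content) tuples, preserving order."""
    # Assumption: any role not in ROLE_MAP is treated as a human message.
    return [(ROLE_MAP.get(m["role"], "human"), m["content"]) for m in messages]
```

Inside `completion`, calling `self.model.invoke(to_langchain_messages(messages))` would then replace the single-prompt call.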

Configuration

Lazzaro is highly configurable. You can adjust these settings during initialization or via the CLI.

Parameter	Default	Description
`auto_consolidate`	`True`	Automatically extract facts and update the graph at the interval set by `consolidate_every`.
`consolidate_every`	`3`	Frequency of full consolidation runs (in number of conversations).
`max_buffer_size`	`10`	Maximum number of active nodes in the graph before older ones are pruned.
`enable_async`	`True`	Run consolidation and embedding tasks in background threads for responsiveness.
`enable_sharding`	`True`	Organize memories into semantic topics (`work`, `personal`) or date-based shards.
`enable_hierarchy`	`True`	Create "Super-Nodes" to summarize large clusters of memories.
`load_from_disk`	`True`	Automatically reload the last saved state on initialization.
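
As a sketch of overriding these settings at initialization, the snippet below merges overrides into the defaults from the table and rejects misspelled parameter names. `DEFAULTS` and `build_settings` are hypothetical helpers, not part of Lazzaro. It is also an assumption that `MemorySystem` accepts every table parameter as a keyword argument; only `enable_async` and `auto_consolidate` appear in the Python API example.

```python
# Defaults copied from the configuration table.
DEFAULTS = {
    "auto_consolidate": True,
    "consolidate_every": 3,
    "max_buffer_size": 10,
    "enable_async": True,
    "enable_sharding": True,
    "enable_hierarchy": True,
    "load_from_disk": True,
}


def build_settings(**overrides):
    """Merge overrides into the defaults, rejecting unknown parameter names."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


settings = build_settings(max_buffer_size=50, consolidate_every=5)
# Assumed usage: ms = MemorySystem(openai_api_key=..., **settings)
```

Validating names up front catches typos such as `max_bufer_size` before they reach the constructor.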

Project details

Release history

0.2.5.3: Dec 25, 2025
0.2.5.2: Dec 25, 2025
0.2.5.1: Dec 25, 2025
0.2.5: Dec 25, 2025
0.2.4: Dec 25, 2025
0.2.3: Dec 25, 2025
0.2.2: Dec 25, 2025
0.2.1: Dec 25, 2025
0.2.0: Dec 25, 2025
0.1.2.3: Dec 25, 2025
0.1.2.2: Dec 25, 2025
0.1.2.1: Dec 24, 2025 (this version)
0.1.2: Dec 24, 2025
0.1.1: Dec 24, 2025
0.1.0: Dec 24, 2025

Download files

Source Distribution

lazzaro-0.1.2.1.tar.gz (24.2 kB)

Uploaded Dec 24, 2025 Source

Built Distribution

lazzaro-0.1.2.1-py3-none-any.whl (23.2 kB)

Uploaded Dec 24, 2025 Python 3

File details

Details for the file lazzaro-0.1.2.1.tar.gz.

File metadata

Download URL: lazzaro-0.1.2.1.tar.gz
Upload date: Dec 24, 2025
Size: 24.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for lazzaro-0.1.2.1.tar.gz
Algorithm	Hash digest
SHA256	`828fdedae94105f14179b020b4021a2d3c77421929396563435dccb12bd45141`
MD5	`acf3cd83f84996456cc15a604aa7ca73`
BLAKE2b-256	`5ecb980ad84c04b1f40d27e7ec5dcaddd92da3873a941279fb49500c77a0f7ae`

File details

Details for the file lazzaro-0.1.2.1-py3-none-any.whl.

File metadata

Download URL: lazzaro-0.1.2.1-py3-none-any.whl
Upload date: Dec 24, 2025
Size: 23.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for lazzaro-0.1.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`39dea7f3515aaa2bb606de36c36cf3b1e9a9e9d4908ac2e2ce01499b83b291a4`
MD5	`f12589b2a92c8688307b747a527688c0`
BLAKE2b-256	`a9b28ce77e82fcf951aa83fc7bdf9cae3224324e498262c39a7f2cc40215eb7a`
