A scalable memory system for AI agents using graph-based sharding and hierarchical clustering.

Lazzaro

Scalable Memory System Library for AI Agents

Lazzaro is a high-performance Python library for long-term, structured agent memory. It goes beyond simple vector search by implementing a graph-based architecture with semantic sharding, hierarchical clustering, and biologically inspired decay. It also evolves a multi-domain user profile by continuously consolidating conversations.

Installation

pip install lazzaro
  • Extensions: google-generativeai, together (LLMs), langchain-core (Integrations), matplotlib (Visualization).

Core Architecture

Lazzaro organizes memory as a multi-layered graph optimized for scale:

  • Semantic Shards: Topic-based subgraphs (e.g., "coding", "health") that provide natural isolation and accelerated retrieval.
  • LanceDB Backed: Persistent storage of the entire graph (nodes, edges, profile) with sub-millisecond vector performance.
  • Hierarchical Super-Nodes: Summary nodes representing large clusters, allowing for abstract reasoning and fast top-down search.
  • Multi-User Factory: Native multi-tenant support with B-Tree optimized user partitioning.
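
To make the idea of semantic shards concrete, here is a minimal sketch of shard-first retrieval. It is not Lazzaro's implementation: the `cosine`, `centroid`, and `search` helpers, the dictionary layout, and the toy 2-D vectors are all assumptions made for illustration. The query is first routed to the shard with the closest centroid, and only that shard's nodes are ranked.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def search(shards, query, top_k=2):
    """Route the query to the closest shard, then rank only that shard's nodes."""
    best = max(
        shards,
        key=lambda name: cosine(centroid([vec for _, vec in shards[name]]), query),
    )
    ranked = sorted(shards[best], key=lambda node: cosine(node[1], query), reverse=True)
    return best, [text for text, _ in ranked[:top_k]]

# Hypothetical toy data: (text, embedding) pairs grouped by topic.
shards = {
    "coding": [("prefers async-std", [0.9, 0.1]), ("works on a Rust project", [0.8, 0.3])],
    "health": [("runs every morning", [0.1, 0.9])],
}
shard, hits = search(shards, query=[1.0, 0.2])
print(shard, hits)  # coding ['prefers async-std', 'works on a Rust project']
```

Routing first keeps the per-query work proportional to one shard rather than the whole graph, which is the isolation benefit described above.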

Memory Lifecycle

  1. Episodic Buffer: Immediate caching of conversations for short-term context.
  2. Background Consolidation: Asynchronous LLM extraction of atomic facts, deduplication via LanceDB, and associative graph linking.
  3. Temporal Decay & Pruning: Biologically inspired pruning in which weak associations fade and salience decays non-linearly, preventing memory bloat.
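
The decay and pruning step can be sketched as follows. The exact decay curve Lazzaro uses is not documented here, so the exponential half-life formula, the `half_life_hours` parameter, and both function names are assumptions. The 0.5 pruning threshold matches the documented default.

```python
import math

def decayed_salience(salience, hours_idle, half_life_hours=24.0):
    """Exponential (non-linear) decay: salience halves every `half_life_hours`."""
    return salience * math.pow(0.5, hours_idle / half_life_hours)

def prune_edges(edges, threshold=0.5):
    """Keep only edges whose weight is at least `threshold`."""
    return {pair: weight for pair, weight in edges.items() if weight >= threshold}

# Hypothetical edge weights between memory nodes.
edges = {("rust", "async-std"): 0.9, ("rust", "coffee"): 0.2}
kept = prune_edges(edges)
two_days_old = decayed_salience(1.0, hours_idle=48)

print(kept)          # {('rust', 'async-std'): 0.9}
print(two_days_old)  # 0.25
```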

User Profile & Retrieval

Lazzaro evolves a structured persona across five domains (Preferences, Personality, Knowledge, Style, and Experiences). Retrieval uses a hybrid approach:

  • Sharded Semantic Search: Narrowing the search space to relevant subgraphs.
  • Associative Boosting: High-salience nodes pull their neighbors into the current context buffer.
  • Optimized Retrieval: Combines vector similarity with hierarchical pathing and frequency metrics.
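
One way to picture the hybrid approach is a weighted score plus a neighbor boost. This is only a sketch under stated assumptions: the weights `w_sim` and `w_freq`, the `min_salience` cutoff, the `boost` factor, and both function names are hypothetical and not taken from Lazzaro.

```python
def hybrid_score(similarity, access_count, max_access, w_sim=0.8, w_freq=0.2):
    """Blend vector similarity with a normalized access-frequency term."""
    freq = access_count / max_access if max_access else 0.0
    return w_sim * similarity + w_freq * freq

def boost_neighbors(scores, edges, salience, min_salience=0.7, boost=0.1):
    """Raise the score of nodes linked from high-salience nodes, scaled by edge weight."""
    boosted = dict(scores)
    for (src, dst), weight in edges.items():
        if salience.get(src, 0.0) >= min_salience:
            boosted[dst] = boosted.get(dst, 0.0) + boost * weight
    return boosted

# Hypothetical retrieval state for a single query.
scores = {"rust": 0.9, "tokio": 0.4}
edges = {("rust", "async-std"): 0.8, ("tokio", "epoll"): 0.9}
salience = {"rust": 0.9, "tokio": 0.3}

boosted = boost_neighbors(scores, edges, salience)
print(sorted(boosted))  # ['async-std', 'rust', 'tokio']
```

Here "async-std" enters the candidate set only because its high-salience neighbor "rust" pulled it in, while "epoll" stays out because "tokio" is below the salience cutoff.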

Usage

Provider Configuration

from lazzaro.core.memory_system import MemorySystem
from lazzaro.core.providers import GeminiLLM, GeminiEmbedder

# Initialize providers
llm = GeminiLLM(api_key="API_KEY", model="gemini-1.5-flash")
embedder = GeminiEmbedder(api_key="API_KEY")

# Initialize Memory System
ms = MemorySystem(
    llm_provider=llm,
    embedding_provider=embedder,
    enable_sharding=True,
    enable_hierarchy=True,
    max_buffer_size=100,  # overrides the default of 10
)

# Chat with built-in memory retrieval
ms.start_conversation()
response = ms.chat("I'm working on a Rust project and I prefer using async-std.")
print(response)

# Finalize and trigger background consolidation
print(ms.end_conversation())

Visual Dashboard

Lazzaro includes a web-based dashboard for exploring memory interactively. Launch it with:

lazzaro-dashboard

The dashboard will be available at http://localhost:5299 and features:

  • Live Force-Graph: Interactive visualization of your memory shards and node relationships.
  • Real-time Metrics: Monitor LLM calls, embedding costs, and retrieval latency.
  • Profile Explorer: View your evolved user persona domains in a sleek side drawer.

Integrations

LangChain

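from lazzaro.integrations import LazzaroLangChainMemory
from langchain.chains import ConversationChain

# `ms` is the MemorySystem from the Provider Configuration example;
# `chat_model` is any LangChain chat model you have already constructed.
memory = LazzaroLangChainMemory(memory_system=ms)
chain = ConversationChain(llm=chat_model, memory=memory)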

LangGraph

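from lazzaro.integrations import LazzaroLangGraph

# `ms` is the MemorySystem from the Provider Configuration example;
# `builder` is assumed to be a LangGraph graph builder (e.g. a StateGraph) created elsewhere.
lg = LazzaroLangGraph(ms)
builder.add_node("retrieve", lg.get_memory_node())
builder.add_node("record", lg.get_record_node())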

CLI Reference

Launch the interactive shell:

lazzaro-cli

Command Table

Command        Description
/start         Manual session initialization.
/end           Manual session termination and consolidation trigger.
/stats         Display node counts, shard density, and performance metrics.
/profile       View evolved user profile data.
/memories [n]  Inspect the n most recent memory nodes.
/consolidate   Force immediate graph-wide consolidation.
/merge         Manually trigger semantic deduplication of similar nodes.
/prune [t]     Remove edges with weights below threshold t (default: 0.5).
/config        View and modify runtime parameters.
/save [file]   Export current state to JSON.
/load [file]   Import state from JSON.

Parameter Reference

Parameter          Default  Description
auto_consolidate   True     Extract facts after every N conversations.
consolidate_every  3        Conversation frequency for consolidation.
max_buffer_size    10       Total nodes allowed before archiving.
enable_async       True     Background thread processing for consolidation.
enable_sharding    True     Use topic-based subgraph isolation.
prune_threshold    0.5      Minimum weight to retain an edge.
load_from_disk     True     Automatically restore state from LanceDB on startup.
db_dir             "db"     Directory for LanceDB persistence.
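
As a rough illustration of how auto_consolidate and consolidate_every interact, the sketch below checks whether consolidation is due after a given number of finished conversations. The `MemoryConfig` dataclass and `should_consolidate` function are hypothetical; only the parameter names and defaults come from the table above, and the real trigger logic may differ.

```python
from dataclasses import dataclass

@dataclass
class MemoryConfig:
    """Hypothetical container mirroring a few documented defaults."""
    auto_consolidate: bool = True
    consolidate_every: int = 3
    max_buffer_size: int = 10
    prune_threshold: float = 0.5

def should_consolidate(conversations_ended, cfg):
    """True when the finished-conversation count reaches a multiple of consolidate_every."""
    if not cfg.auto_consolidate or cfg.consolidate_every <= 0:
        return False
    return conversations_ended > 0 and conversations_ended % cfg.consolidate_every == 0

cfg = MemoryConfig()
triggers = [n for n in range(1, 8) if should_consolidate(n, cfg)]
print(triggers)  # [3, 6]
```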

Persistence and Safety

  • LanceDB Native Persistence: Lazzaro maintains its entire state (Graph + Vector + Profile) within LanceDB tables inside the db/ directory.
  • Atomic Updates: Database operations are atomic, preventing state corruption during unexpected shutdowns.
  • Version Control: LanceDB's internal versioning allows for reliable multi-process access and synchronization.
  • JSON Export: Human-readable snapshots can be exported using the /save command or save_state() method for easy debugging and porting.
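
For readers writing their own snapshot tooling around the JSON export, the following sketch shows a crash-safe way to write and reload a snapshot using only the standard library. It does not describe how Lazzaro or LanceDB persist data internally: `save_snapshot`, `load_snapshot`, and the shape of `state` are assumptions for illustration. The file is written to a temporary path and then swapped into place, so a reader never sees a half-written snapshot.

```python
import json
import os
import tempfile

def save_snapshot(state, path):
    """Write JSON to a temp file in the target directory, then atomically replace `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(state, fh, indent=2, sort_keys=True)
            fh.flush()
            os.fsync(fh.fileno())  # push bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic when both paths are on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

def load_snapshot(path):
    """Read a snapshot previously written by save_snapshot."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)

# Hypothetical snapshot layout, used only to exercise the round trip.
state = {
    "nodes": [{"id": "n1", "text": "prefers async-std", "salience": 0.9}],
    "edges": [{"src": "n1", "dst": "n2", "weight": 0.8}],
}
with tempfile.TemporaryDirectory() as tmp:
    snapshot_path = os.path.join(tmp, "snapshot.json")
    save_snapshot(state, snapshot_path)
    restored = load_snapshot(snapshot_path)
    leftovers = [name for name in os.listdir(tmp) if name.endswith(".tmp")]

print(restored == state)  # True
```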

Development

Run tests:

pytest tests/

License

This project is licensed under the MIT License.

Download files

Source Distribution

lazzaro-0.2.5.3.tar.gz (47.3 kB)

Built Distribution

lazzaro-0.2.5.3-py3-none-any.whl (45.9 kB)

