A scalable memory system for AI agents using graph-based sharding and hierarchical clustering.
Lazzaro
Scalable Memory System Library for AI Agents
Lazzaro is a high-performance Python library for long-term, structured agent memory. It goes beyond simple vector search by implementing a graph-based architecture with semantic sharding, hierarchical clustering, and biologically inspired decay. It evolves a multi-domain user profile by continuously consolidating conversations.
Installation
pip install lazzaro
- Extensions: google-generativeai, together (LLMs), langchain-core (Integrations), matplotlib (Visualization).
Core Architecture
Lazzaro organizes memory as a multi-layered graph optimized for scale:
- Semantic Shards: Topic-based subgraphs (e.g., "coding", "health") that provide natural isolation and accelerated retrieval.
- LanceDB Backed: Persistent storage of the entire graph (nodes, edges, profile) with sub-millisecond vector performance.
- Hierarchical Super-Nodes: Summary nodes representing large clusters, allowing for abstract reasoning and fast top-down search.
- Multi-User Factory: Native multi-tenant support with B-Tree optimized user partitioning.
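The top-down search enabled by shards and super-nodes can be sketched roughly as follows. This is a minimal illustration, not Lazzaro's implementation: the `Shard` class, the `top_down_search` function, and the use of an embedding centroid as a stand-in for a summary super-node are all assumptions made for the example, and the two-dimensional vectors are toy data.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class Shard:
    """Hypothetical topic-based subgraph; centroid() stands in for a super-node."""
    topic: str
    nodes: dict = field(default_factory=dict)  # node text -> embedding

    def centroid(self):
        vectors = list(self.nodes.values())
        return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def top_down_search(shards, query, k=1):
    """Pick the shard whose centroid best matches the query, then rank its nodes."""
    best = max(shards, key=lambda s: cosine(s.centroid(), query))
    ranked = sorted(
        best.nodes,
        key=lambda text: cosine(best.nodes[text], query),
        reverse=True,
    )
    return best.topic, ranked[:k]


coding = Shard("coding", {"prefers async-std": [0.9, 0.1], "uses Rust": [0.8, 0.3]})
health = Shard("health", {"runs on weekends": [0.1, 0.9]})

topic, hits = top_down_search([coding, health], query=[1.0, 0.2])
print(topic, hits)
```

Comparing the query against one summary per shard first means only the nodes of the winning shard are scored individually, which is the intuition behind "fast top-down search".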
Memory Lifecycle
- Episodic Buffer: Immediate caching of conversations for short-term context.
- Background Consolidation: Asynchronous LLM extraction of atomic facts, deduplication via LanceDB, and associative graph linking.
- Temporal Decay & Pruning: Biologically inspired pruning in which weak associations fade and salience decays non-linearly to prevent memory bloat.
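As a rough sketch of the decay and pruning step, the snippet below assumes an exponential half-life curve for salience. The README only says that decay is non-linear, so the curve, the `half_life_hours` parameter, and both function names are hypothetical. The 0.5 edge threshold matches the documented `prune_threshold` default.

```python
import math


def decayed_salience(salience, hours_idle, half_life_hours=72.0):
    """Assumed exponential decay: salience halves every `half_life_hours`."""
    return salience * math.pow(0.5, hours_idle / half_life_hours)


def prune_edges(edges, threshold=0.5):
    """Keep only edges whose weight is at or above the threshold."""
    return {pair: weight for pair, weight in edges.items() if weight >= threshold}


edges = {("rust", "async-std"): 0.9, ("rust", "coffee"): 0.2}
kept = prune_edges(edges)
print(kept)
print(decayed_salience(1.0, hours_idle=72.0))
```

With these assumptions, a node left untouched for one half-life keeps half of its salience, and the weak ("rust", "coffee") association is dropped while the strong one survives.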
User Profile & Retrieval
Lazzaro evolves a structured persona across five domains (Preferences, Personality, Knowledge, Style, and Experiences). Retrieval uses a hybrid approach:
- Sharded Semantic Search: Narrowing the search space to relevant subgraphs.
- Associative Boosting: High-salience nodes pull their neighbors into the current context buffer.
- Optimized Retrieval: Combines vector similarity with hierarchical pathing and frequency metrics.
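One way to picture the hybrid approach is a blended score plus a neighbor-expansion pass. The sketch below is illustrative only: the weights, the `min_salience` cutoff, and the function names are assumptions, and the hierarchical pathing signal is left out for brevity.

```python
def hybrid_score(similarity, access_count, max_count, w_sim=0.8, w_freq=0.2):
    """Blend vector similarity with a normalized access-frequency signal."""
    frequency = access_count / max_count if max_count else 0.0
    return w_sim * similarity + w_freq * frequency


def associative_boost(hits, salience, neighbors, min_salience=0.7):
    """Append the neighbors of high-salience hits to the context, without duplicates."""
    context = list(hits)
    for node in hits:
        if salience.get(node, 0.0) >= min_salience:
            for neighbor in neighbors.get(node, []):
                if neighbor not in context:
                    context.append(neighbor)
    return context


salience = {"uses Rust": 0.9, "likes tea": 0.3}
neighbors = {"uses Rust": ["prefers async-std"], "likes tea": ["owns a kettle"]}

context = associative_boost(["uses Rust", "likes tea"], salience, neighbors)
print(context)
print(hybrid_score(similarity=1.0, access_count=5, max_count=10))
```

Here only the high-salience "uses Rust" node pulls its neighbor into the context, while the low-salience "likes tea" node is returned on its own.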
Usage
Provider Configuration
from lazzaro.core.memory_system import MemorySystem
from lazzaro.core.providers import GeminiLLM, GeminiEmbedder
# Initialize providers
llm = GeminiLLM(api_key="API_KEY", model="gemini-1.5-flash")
embedder = GeminiEmbedder(api_key="API_KEY")
# Initialize Memory System
ms = MemorySystem(
    llm_provider=llm,
    embedding_provider=embedder,
    enable_sharding=True,
    enable_hierarchy=True,
    max_buffer_size=100,
)
# Chat with built-in memory retrieval
ms.start_conversation()
response = ms.chat("I'm working on a Rust project and I prefer using async-std.")
print(response)
# Finalize and trigger background consolidation
print(ms.end_conversation())
Visual Dashboard
For a high-fidelity, interactive experience, Lazzaro includes a custom web-based dashboard:
lazzaro-dashboard
The dashboard will be available at http://localhost:5299 and features:
- Live Force-Graph: Interactive visualization of your memory shards and node relationships.
- Real-time Metrics: Monitor LLM calls, embedding costs, and retrieval latency.
- Profile Explorer: View your evolved user persona domains in a sleek side drawer.
Integrations
LangChain
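from lazzaro.integrations import LazzaroLangChainMemory
from langchain.chains import ConversationChain
# ms is the MemorySystem from the Usage section; chat_model is any LangChain chat model you have already created
memory = LazzaroLangChainMemory(memory_system=ms)
chain = ConversationChain(llm=chat_model, memory=memory)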
LangGraph
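from lazzaro.integrations import LazzaroLangGraph
# ms is the MemorySystem from the Usage section; builder is a LangGraph graph builder you have already created
lg = LazzaroLangGraph(ms)
builder.add_node("retrieve", lg.get_memory_node())
builder.add_node("record", lg.get_record_node())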
CLI Reference
Launch the interactive shell:
lazzaro-cli
Command Table
| Command | Description |
|---|---|
| /start | Manual session initialization. |
| /end | Manual session termination and consolidation trigger. |
| /stats | Display node counts, shard density, and performance metrics. |
| /profile | View evolved user profile data. |
| /memories [n] | Inspect the n most recent memory nodes. |
| /consolidate | Force immediate graph-wide consolidation. |
| /merge | Manually trigger semantic deduplication of similar nodes. |
| /prune [t] | Remove edges with weights below threshold t (default: 0.5). |
| /config | View and modify runtime parameters. |
| /save [file] | Export current state to JSON. |
| /load [file] | Import state from JSON. |
Parameter Reference
| Parameter | Default | Description |
|---|---|---|
| auto_consolidate | True | Extract facts after every N conversations. |
| consolidate_every | 3 | Conversation frequency for consolidation. |
| max_buffer_size | 10 | Total nodes allowed before archiving. |
| enable_async | True | Background thread processing for consolidation. |
| enable_sharding | True | Use topic-based subgraph isolation. |
| prune_threshold | 0.5 | Minimum weight to retain an edge. |
| load_from_disk | True | Automatically restore state from LanceDB on startup. |
| db_dir | "db" | Directory for LanceDB persistence. |
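The defaults in the table can be collected into a plain dictionary and overridden per deployment. This is a sketch under one assumption: that these parameters are accepted as keyword arguments by `MemorySystem`, as `enable_sharding` and `max_buffer_size` are in the Usage example. The override values are arbitrary.

```python
# Defaults copied from the parameter table.
defaults = {
    "auto_consolidate": True,
    "consolidate_every": 3,
    "max_buffer_size": 10,
    "enable_async": True,
    "enable_sharding": True,
    "prune_threshold": 0.5,
    "load_from_disk": True,
    "db_dir": "db",
}

# Example overrides: a larger buffer and stricter edge pruning.
overrides = {"max_buffer_size": 100, "prune_threshold": 0.6}
settings = {**defaults, **overrides}
print(settings)

# Assumed usage (requires lazzaro and configured providers):
# ms = MemorySystem(llm_provider=llm, embedding_provider=embedder, **settings)
```

Keeping the defaults in one place makes it easy to see which values a given deployment has changed.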
Persistence and Safety
- LanceDB Native Persistence: Lazzaro maintains its entire state (Graph + Vector + Profile) within LanceDB tables inside the db/ directory.
- Atomic Updates: Database operations are atomic, preventing state corruption during unexpected shutdowns.
- Version Control: LanceDB's internal versioning allows for reliable multi-process access and synchronization.
- JSON Export: Human-readable snapshots can be exported using the /save command or save_state() method for easy debugging and porting.
Development
Run tests:
pytest tests/
License
This project is licensed under the MIT License.