lakehouse-memory

Unity Catalog-native episodic, semantic, and working memory for AI agents on Databricks.

Status: Alpha. Public from day one. v0.1.0 is the first releasable cut. See the spec for design intent.

The pitch

Memory is the missing Databricks layer. The standard workaround is a sidecar vector DB with its own governance, access control, and lineage — a system you can't ship. Memory belongs in Unity Catalog, where your data already lives.

lakehouse-memory gives AI agents on Databricks three first-class memory primitives — episodic, semantic, and working — backed by Unity Catalog tables and Databricks Vector Search.

Install

pip install lakehouse-memory

Quickstart
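The quickstart reads four environment variables and strips the `https://` prefix from `DATABRICKS_HOST` inline. As a minimal sketch, the helper below fails fast when any of them is unset and normalizes the hostname in one place. `load_settings` and `bare_hostname` are hypothetical names introduced for illustration; they are not part of lakehouse-memory.

```python
import os

# Environment variables read by the quickstart.
REQUIRED_VARS = (
    "DATABRICKS_HOST",
    "DATABRICKS_HTTP_PATH",
    "DATABRICKS_TOKEN",
    "DATABRICKS_VECTOR_SEARCH_ENDPOINT",
)


def load_settings(env=None):
    """Return the required variables as a dict, or raise naming every missing one.

    Hypothetical helper, not part of lakehouse-memory.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


def bare_hostname(host):
    """Strip the URL scheme and any trailing slash (hypothetical helper)."""
    for scheme in ("https://", "http://"):
        if host.startswith(scheme):
            host = host[len(scheme):]
    return host.rstrip("/")
```

With this in place, `bare_hostname(settings["DATABRICKS_HOST"])` could stand in for the inline `.replace("https://", "")` call, and a missing token surfaces as one clear error instead of a `KeyError` partway through setup.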

from lakehouse_memory import Memory, MemoryConfig, Scope
from lakehouse_memory.client import SqlConnectorClient
from lakehouse_memory.vector_databricks import DatabricksVectorIndex
import os

config = MemoryConfig(catalog="main", schema_name="agent_memory")

client = SqlConnectorClient(
    server_hostname=os.environ["DATABRICKS_HOST"].replace("https://", ""),
    http_path=os.environ["DATABRICKS_HTTP_PATH"],
    access_token=os.environ["DATABRICKS_TOKEN"],
)

index = DatabricksVectorIndex(
    endpoint_name=os.environ["DATABRICKS_VECTOR_SEARCH_ENDPOINT"],
    index_name=f"{config.catalog}.{config.schema_name}.episodic_idx",
    workspace_url=os.environ["DATABRICKS_HOST"],
    access_token=os.environ["DATABRICKS_TOKEN"],
    columns=["event_id", "text", "user_id", "session_id", "agent_id"],
)

mem = Memory(config=config, client=client, index=index, scope=Scope(user_id="u_1"))
mem.provision(
    vector_search_endpoint=os.environ["DATABRICKS_VECTOR_SEARCH_ENDPOINT"],
    workspace_url=os.environ["DATABRICKS_HOST"],
    access_token=os.environ["DATABRICKS_TOKEN"],
)

# Write a fact
mem.semantic.upsert(fact="User prefers SQL over Python.")

# The Delta Sync index uses a TRIGGERED pipeline, so fire the sync explicitly after writes.
# (For production, consider switching to a CONTINUOUS pipeline.)
mem.semantic._index.trigger_sync()

# Wait for the sync to finish. The fixed sleep is a placeholder;
# production code would poll with exponential backoff.
import time

time.sleep(15)

facts = mem.semantic.retrieve("language preferences", k=3)
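The quickstart notes that production code would use exponential backoff instead of a fixed sleep. A minimal sketch of such a polling loop follows. `wait_until` is a hypothetical helper, not part of lakehouse-memory, and the `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.

```python
import time


def wait_until(predicate, timeout=120.0, initial_delay=1.0, max_delay=30.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() with exponentially growing delays.

    Returns True as soon as predicate() is truthy, or False once
    `timeout` seconds have elapsed without success.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        if predicate():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(delay, remaining))
        # Double the delay after each miss, capped at max_delay.
        delay = min(delay * 2, max_delay)
```

In the quickstart, the predicate could be something like `lambda: bool(mem.semantic.retrieve("language preferences", k=3))`, assuming `retrieve` returns an empty sequence until the index has synced. That behavior is an assumption, not something this README confirms.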

LangChain integration:

chat = mem.as_langchain_chat_history(limit=50)
retriever = mem.as_langchain_retriever(k=5)

Production gaps

(Coming in M4. Short version: compaction at scale, multi-tenant RLS, regression evals, observability, and custom retrieval strategies are deliberately not in OSS. If you want help building past those, the Burmaster Databricks AI Practice does this for a living.)

License

Apache 2.0. See LICENSE.
