lakehouse-memory

Unity Catalog-native episodic, semantic, and working memory for AI agents on Databricks.

Status: Stable (0.1.0). Public from day one. The core library, LangChain adapters, DAB starter (M3), and docs site (M4) are shipped. See the docs for full documentation.

The pitch

Memory is the missing Databricks layer. The standard workaround is a sidecar vector DB with its own governance, access control, and lineage — a system you can't ship. Memory belongs in Unity Catalog, where your data already lives.

lakehouse-memory gives AI agents on Databricks three first-class memory primitives — episodic, semantic, and working — backed by Unity Catalog tables and Databricks Vector Search.

Install

pip install lakehouse-memory

Migrating from a pre-release: Memory(index=...) was removed in 0.1.0. Use Memory(config, client, episodic_index=idx, semantic_index=idx) or, preferably, Memory.from_databricks(...).

Quickstart with the DAB starter (recommended)

Bootstrap the whole reference architecture — UC tables, Vector Search indexes, and a working chat agent — in your Databricks workspace:

databricks bundle init https://github.com/travis-burmaster/lakehouse-memory \
  --template-dir templates/lakehouse-memory-bundle \
  --output-dir my-memory-demo
cd my-memory-demo
databricks bundle deploy
databricks bundle run setup_job

You'll be prompted for your catalog, schema, Vector Search endpoint, SQL warehouse HTTP path, and LLM serving endpoint. (--output-dir is the project root itself, not a parent directory.) After setup_job finishes, open notebooks/02_chat_agent.ipynb and run all cells.

The setup job typically takes ~15 minutes end-to-end: the bulk is the library install plus the one-time provisioning of two Delta Sync Vector Search indexes (which sync serially on workspaces with a single-pipeline quota). Subsequent runs against an already-provisioned schema are fast.

Manual setup (advanced)
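
The manual setup code reads four environment variables and strips the scheme from DATABRICKS_HOST by hand. As a minimal pre-flight sketch, assuming you want to fail early on a missing variable and tolerate a trailing slash in the host, something like the following could run first. The helper names bare_hostname and load_settings are hypothetical and are not part of lakehouse-memory.

```python
import os
from typing import Dict, Mapping
from urllib.parse import urlparse

# The variables read by the manual setup example.
REQUIRED_VARS = (
    "DATABRICKS_HOST",
    "DATABRICKS_HTTP_PATH",
    "DATABRICKS_TOKEN",
    "DATABRICKS_VECTOR_SEARCH_ENDPOINT",
)


def bare_hostname(host: str) -> str:
    """Return the hostname without scheme, port, path, or trailing slash.

    Hypothetical helper, not part of lakehouse-memory.
    """
    host = host.strip()
    # urlparse only fills in .hostname when a scheme (or "//") is present.
    parsed = urlparse(host if "://" in host else f"//{host}")
    if not parsed.hostname:
        raise ValueError(f"cannot extract a hostname from {host!r}")
    return parsed.hostname


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the required variables, raising if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    # Extra key for connectors that expect a hostname rather than a URL.
    settings["server_hostname"] = bare_hostname(settings["DATABRICKS_HOST"])
    return settings
```

With this in place, settings["server_hostname"] could stand in for the manual .replace("https://", "") call, while settings["DATABRICKS_HOST"] keeps the full workspace URL.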

from lakehouse_memory import Memory, MemoryConfig, Scope
from lakehouse_memory.client import SqlConnectorClient
from lakehouse_memory.vector_databricks import DatabricksVectorIndex
import os

config = MemoryConfig(catalog="main", schema_name="agent_memory")

client = SqlConnectorClient(
    server_hostname=os.environ["DATABRICKS_HOST"].replace("https://", ""),
    http_path=os.environ["DATABRICKS_HTTP_PATH"],
    access_token=os.environ["DATABRICKS_TOKEN"],
)

index = DatabricksVectorIndex(
    endpoint_name=os.environ["DATABRICKS_VECTOR_SEARCH_ENDPOINT"],
    index_name=f"{config.catalog}.{config.schema_name}.episodic_idx",
    workspace_url=os.environ["DATABRICKS_HOST"],
    access_token=os.environ["DATABRICKS_TOKEN"],
    columns=["event_id", "text", "user_id", "session_id", "agent_id"],
)

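# Memory(index=...) was removed in 0.1.0; pass both indexes explicitly.
mem = Memory(
    config=config,
    client=client,
    episodic_index=index,
    semantic_index=index,
    scope=Scope(user_id="u_1"),
)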
mem.provision(
    vector_search_endpoint=os.environ["DATABRICKS_VECTOR_SEARCH_ENDPOINT"],
    workspace_url=os.environ["DATABRICKS_HOST"],
    access_token=os.environ["DATABRICKS_TOKEN"],
)

# Write a fact
mem.semantic.upsert(fact="User prefers SQL over Python.")

# Delta Sync indexes are TRIGGERED — explicitly fire the sync after writes.
# (For production, consider switching to CONTINUOUS pipelines.)
mem.semantic._index.trigger_sync()

# Wait for the sync to land; production code would use exponential backoff.
import time
time.sleep(15)

facts = mem.semantic.retrieve("language preferences", k=3)
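
The fixed sleep above is a placeholder for exponential backoff. A minimal sketch of that pattern, assuming it is acceptable to re-run the retrieval until it returns something, could look like the following. The helper poll_with_backoff is hypothetical and not part of lakehouse-memory, and the commented usage assumes retrieve returns an empty collection until the sync has landed.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_with_backoff(
    fetch: Callable[[], T],
    is_ready: Callable[[T], bool],
    *,
    initial_delay: float = 1.0,
    factor: float = 2.0,
    max_delay: float = 30.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call fetch() until is_ready(result) is true, waiting longer after each miss.

    Hypothetical helper. The delay starts at initial_delay, is multiplied by
    factor after every miss, and is capped at max_delay. TimeoutError is raised
    once timeout seconds have elapsed without a ready result.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        result = fetch()
        if is_ready(result):
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(f"result not ready after {timeout:.0f}s")
        # Never sleep past the deadline.
        sleep(min(delay, remaining))
        delay = min(delay * factor, max_delay)


# Hypothetical usage with the objects from the example above:
# facts = poll_with_backoff(
#     lambda: mem.semantic.retrieve("language preferences", k=3),
#     is_ready=lambda results: bool(results),
# )
```

The sleep and clock parameters exist only so the helper can be exercised without real waiting; the defaults use the standard library.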

LangChain integration:

chat = mem.as_langchain_chat_history(limit=50)
retriever = mem.as_langchain_retriever(k=5)

Production gaps

(Coming in M4. Short version: compaction at scale, multi-tenant RLS, regression evals, observability, and custom retrieval strategies are deliberately not in OSS. If you want help building past those, the Burmaster Databricks AI Practice does this for a living.)

License

Apache 2.0. See LICENSE.
