lakehouse-memory

Unity Catalog-native episodic, semantic, and working memory for AI agents on Databricks.

Status: Pre-release (0.1.0b2). Public from day one. The core library and LangChain adapters are workspace-validated; the DAB starter (M3) and docs site (M4) are not yet shipped. See the spec for design intent.

The pitch

Memory is the missing Databricks layer. The standard workaround is a sidecar vector DB with its own governance, access control, and lineage — a system you can't ship. Memory belongs in Unity Catalog, where your data already lives.

lakehouse-memory gives AI agents on Databricks three first-class memory primitives — episodic, semantic, and working — backed by Unity Catalog tables and Databricks Vector Search.
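To make the distinction between the three primitives concrete, here is a minimal in-memory sketch. It is a toy illustration only: `ToyMemory` and its method names are hypothetical and are not part of the lakehouse-memory API, and the library's actual table schemas and semantics may differ.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ToyMemory:
    """In-memory illustration only; NOT the lakehouse-memory API."""

    episodic: list = field(default_factory=list)  # append-only event log
    semantic: dict = field(default_factory=dict)  # durable facts, keyed for upsert
    working: dict = field(default_factory=dict)   # short-lived task state

    def record_event(self, text: str) -> None:
        # Episodic: every event is kept, in order, with a timestamp.
        self.episodic.append((datetime.now(timezone.utc), text))

    def upsert_fact(self, key: str, fact: str) -> None:
        # Semantic: a newer fact under the same key replaces the older one.
        self.semantic[key] = fact

    def end_session(self) -> None:
        # Working: scratch state is discarded when the session ends.
        self.working.clear()


toy = ToyMemory()
toy.record_event("User asked for a query in SQL.")
toy.record_event("User asked for SQL again.")
toy.upsert_fact("language_preference", "User prefers Python.")
toy.upsert_fact("language_preference", "User prefers SQL over Python.")
toy.working["current_task"] = "draft revenue query"
toy.end_session()
```

After this runs, the episodic log holds both events, the semantic store holds only the latest preference, and the working state is empty.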

Install

pip install --pre lakehouse-memory

The --pre flag is required while the package is in pre-release. Once 0.1.0 ships (alongside the M3 DAB starter and M4 docs), pip install lakehouse-memory will work without the flag.

Quickstart

from lakehouse_memory import Memory, MemoryConfig, Scope
from lakehouse_memory.client import SqlConnectorClient
from lakehouse_memory.vector_databricks import DatabricksVectorIndex
import os

config = MemoryConfig(catalog="main", schema_name="agent_memory")

client = SqlConnectorClient(
    server_hostname=os.environ["DATABRICKS_HOST"].replace("https://", ""),
    http_path=os.environ["DATABRICKS_HTTP_PATH"],
    access_token=os.environ["DATABRICKS_TOKEN"],
)

index = DatabricksVectorIndex(
    endpoint_name=os.environ["DATABRICKS_VECTOR_SEARCH_ENDPOINT"],
    index_name=f"{config.catalog}.{config.schema_name}.episodic_idx",
    workspace_url=os.environ["DATABRICKS_HOST"],
    access_token=os.environ["DATABRICKS_TOKEN"],
    columns=["event_id", "text", "user_id", "session_id", "agent_id"],
)

mem = Memory(config=config, client=client, index=index, scope=Scope(user_id="u_1"))
mem.provision(
    vector_search_endpoint=os.environ["DATABRICKS_VECTOR_SEARCH_ENDPOINT"],
    workspace_url=os.environ["DATABRICKS_HOST"],
    access_token=os.environ["DATABRICKS_TOKEN"],
)

# Write a fact
mem.semantic.upsert(fact="User prefers SQL over Python.")

# Delta Sync indexes are TRIGGERED — explicitly fire the sync after writes.
# (For production, consider switching to CONTINUOUS pipelines.)
mem.semantic._index.trigger_sync()

# Wait for the sync to finish. A fixed sleep is only a placeholder;
# production code would poll with exponential backoff.
import time

time.sleep(15)

facts = mem.semantic.retrieve("language preferences", k=3)
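The fixed sleep in the quickstart can be replaced with a polling loop. The helper below is a minimal sketch of exponential backoff, not something shipped by lakehouse-memory: `poll_with_backoff` and all of its parameters are hypothetical names introduced here for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_with_backoff(
    fetch: Callable[[], T],
    is_ready: Callable[[T], bool],
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until is_ready(result) is true, doubling the wait each time."""
    delay = initial_delay
    waited = 0.0  # total requested sleep time, not wall-clock time
    while True:
        result = fetch()
        if is_ready(result):
            return result
        if waited >= timeout:
            raise TimeoutError(f"no usable result after {waited:.0f}s of waiting")
        sleep(delay)
        waited += delay
        # Double the delay after each miss, capped at max_delay.
        delay = min(delay * 2, max_delay)
```

In the quickstart this might be used as `facts = poll_with_backoff(lambda: mem.semantic.retrieve("language preferences", k=3), is_ready=bool)`. That usage assumes `retrieve` returns an empty result until the index has synced, which the README does not confirm; adjust `is_ready` to whatever signal your workspace actually exposes.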

LangChain integration:

chat = mem.as_langchain_chat_history(limit=50)
retriever = mem.as_langchain_retriever(k=5)

Production gaps

(Coming in M4. Short version: compaction at scale, multi-tenant RLS, regression evals, observability, and custom retrieval strategies are deliberately not in OSS. If you want help building past those, the Burmaster Databricks AI Practice does this for a living.)

License

Apache 2.0. See LICENSE.

Download files

Source Distribution

lakehouse_memory-0.1.0b2.tar.gz (32.4 kB)

Uploaded Source

Built Distribution

lakehouse_memory-0.1.0b2-py3-none-any.whl (22.8 kB)

Uploaded Python 3

File details

Details for the file lakehouse_memory-0.1.0b2.tar.gz.

File metadata

  • Download URL: lakehouse_memory-0.1.0b2.tar.gz
  • Upload date:
  • Size: 32.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for lakehouse_memory-0.1.0b2.tar.gz
Algorithm Hash digest
SHA256 6014783e6d23d87febb8efb915d42f4d90aab04fd9b2a723f636eb285d203aae
MD5 cec6b9f434ce53a2ba3bf12613bcba89
BLAKE2b-256 d6c35b194d85ffec0001abb1c50f12c1b4c8cb5ab9d6c7463bb04f7acbe9f6b2
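A downloaded file can be checked against the SHA256 digest listed above using only the Python standard library. The sketch below is one way to do it; the helper names `sha256_of` and `matches_published_digest` are hypothetical and introduced here for illustration.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    # Normalize whitespace and case before comparing the two hex strings.
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For example, `matches_published_digest(Path("lakehouse_memory-0.1.0b2.tar.gz"), "6014783e6d23d87febb8efb915d42f4d90aab04fd9b2a723f636eb285d203aae")` should return True for an unmodified copy of the source distribution, assuming the file sits in the current directory.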

Provenance

The following attestation bundles were made for lakehouse_memory-0.1.0b2.tar.gz:

Publisher: publish.yml on travis-burmaster/lakehouse-memory

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file lakehouse_memory-0.1.0b2-py3-none-any.whl.

File hashes

Hashes for lakehouse_memory-0.1.0b2-py3-none-any.whl
Algorithm Hash digest
SHA256 144b817ffb3d3dc29d92305feeaa118be667c2f81f1181bb25478103bc328c25
MD5 5a225780c3b91de05a3dbedfffd4a21f
BLAKE2b-256 827716ab176ae35fdb887d11430c311de4a95ec3991631b1d63bb000e9065e17

Provenance

The following attestation bundles were made for lakehouse_memory-0.1.0b2-py3-none-any.whl:

Publisher: publish.yml on travis-burmaster/lakehouse-memory

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
