An integration package connecting Databricks and LangChain

🦜️🔗 LangChain Databricks

This repository provides LangChain components to connect your LangChain application with various Databricks services.

Features

  • 🤖 LLMs: The ChatDatabricks component allows you to access chat endpoints hosted on Databricks Model Serving, including state-of-the-art models such as Llama3, Mixtral, and DBRX, as well as your own fine-tuned models.
  • 📐 Vector Store: Databricks Vector Search is a serverless similarity search engine that allows you to store a vector representation of your data, including metadata, in a vector database. With Vector Search, you can create auto-updating vector search indexes from Delta tables managed by Unity Catalog and query them with a simple API to return the most similar vectors.
  • 🔢 Embeddings: Provides components for working with embedding models hosted on Databricks Model Serving.
  • 📊 MLflow Integration: LangChain Databricks components are fully integrated with MLflow, providing LLMOps capabilities such as experiment tracking, dependency management, evaluation, and tracing (observability).

Note: This repository will replace all Databricks integrations currently present in the langchain-community package. Users are encouraged to migrate to this repository as soon as possible.

Installation

You can install the langchain-databricks package from PyPI.

pip install -U langchain-databricks
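To confirm that the install succeeded, you can query the installed distribution from Python. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("langchain-databricks"))
```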

If you are using this package outside a Databricks workspace, configure credentials by setting the following environment variables:

export DATABRICKS_HOSTNAME="https://your-databricks-workspace"
export DATABRICKS_TOKEN="your-personal-access-token"

Instead of a personal access token (PAT), you can use OAuth machine-to-machine (M2M) authentication with a service principal:

export DATABRICKS_HOSTNAME="https://your-databricks-workspace"
export DATABRICKS_CLIENT_ID="your-service-principal-client-id"
export DATABRICKS_CLIENT_SECRET="your-service-principal-secret"
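Before running an application, it can help to check which of the two credential sets is present. The sketch below reads the variable names listed above; the `detect_auth_mode` helper and its preference for a PAT over OAuth are assumptions made for illustration and do not describe how the package itself resolves credentials.

```python
import os
from typing import Mapping, Optional

# Variable names are taken from the instructions above.
HOST_VAR = "DATABRICKS_HOSTNAME"
PAT_VARS = ("DATABRICKS_TOKEN",)
OAUTH_VARS = ("DATABRICKS_CLIENT_ID", "DATABRICKS_CLIENT_SECRET")


def detect_auth_mode(env: Mapping[str, str]) -> Optional[str]:
    """Return "pat", "oauth-m2m", or None if the credentials are incomplete."""
    if not env.get(HOST_VAR):
        return None
    # Hypothetical preference order: a PAT wins if both sets are present.
    if all(env.get(name) for name in PAT_VARS):
        return "pat"
    if all(env.get(name) for name in OAUTH_VARS):
        return "oauth-m2m"
    return None


mode = detect_auth_mode(os.environ)
print(f"Detected credentials: {mode or 'none (set the variables above)'}")
```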

Chat Models

ChatDatabricks is a chat model class for accessing chat endpoints hosted on Databricks, including state-of-the-art models such as Llama3, Mixtral, and DBRX, as well as your own fine-tuned models.

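from langchain_databricks import ChatDatabricks

# Connect to a chat endpoint on Databricks Model Serving.
chat_model = ChatDatabricks(endpoint="databricks-meta-llama-3-70b-instruct")
# invoke() returns a message object; its text is in the .content attribute.
response = chat_model.invoke("Sing a ballad of LangChain.")
print(response.content)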

See the usage example for more guidance on how to use it within your LangChain application.
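LangChain chat models also accept a list of (role, content) pairs in place of a plain string, which is a convenient way to add a system prompt. The sketch below only builds such a list; `build_messages` is a hypothetical helper, and the commented lines show how the result might be passed to the model once credentials are configured.

```python
from typing import List, Tuple

Message = Tuple[str, str]


def build_messages(system_prompt: str, question: str) -> List[Message]:
    """Build a (role, content) message list; hypothetical helper."""
    return [("system", system_prompt), ("human", question)]


messages = build_messages(
    "You are a concise assistant.",
    "Sing a ballad of LangChain.",
)
print(messages)

# With credentials configured and the package installed, the list can be
# passed to the model in place of a plain string:
#   from langchain_databricks import ChatDatabricks
#   chat_model = ChatDatabricks(endpoint="databricks-meta-llama-3-70b-instruct")
#   response = chat_model.invoke(messages)
#   print(response.content)
```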

Note: The Databricks LLM class still lives in the langchain-community library. It will be deprecated in the future, so ChatDatabricks is the recommended way to get the latest features.

Embeddings

DatabricksEmbeddings is an embeddings class for accessing text-embedding endpoints hosted on Databricks, including state-of-the-art models such as BGE, as well as your own fine-tuned models.

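from langchain_databricks import DatabricksEmbeddings

embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en")
# Embed a single query string; the result is a list of floats.
vector = embeddings.embed_query("What is vector search?")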

See the usage example for more guidance on how to use it within your LangChain application.
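Embedding vectors are typically compared with cosine similarity. The following is a minimal, dependency-free sketch of that comparison; the toy vectors stand in for values you would obtain from the standard LangChain `embed_query` or `embed_documents` methods, and `cosine_similarity` is an illustrative helper rather than something the package provides.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors in place of real embeddings (assumption for illustration).
query_vec = [0.1, 0.3, 0.5]
doc_vec = [0.2, 0.1, 0.4]
print(round(cosine_similarity(query_vec, doc_vec), 3))
```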

Vector Search

Databricks Vector Search is a serverless similarity search engine that allows you to store a vector representation of your data, including metadata, in a vector database. With Vector Search, you can create auto-updating vector search indexes from Delta tables managed by Unity Catalog and query them with a simple API to return the most similar vectors.

from langchain_databricks.vectorstores import DatabricksVectorSearch

dvs = DatabricksVectorSearch(
    index_name="<YOUR_INDEX_NAME>",
    text_column="text",
    columns=["source"]
)
docs = dvs.similarity_search("What is vector search?")

See the usage example for how to set up vector indices and integrate them with LangChain.
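The search returns LangChain document objects, each with a `page_content` string and a `metadata` dictionary. The sketch below formats such results for display, assuming the `source` column requested above appears under `metadata["source"]`. `FakeDocument` and `summarize_hits` are hypothetical names used so that the example runs without a workspace.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class FakeDocument:
    """Stand-in with the two attributes used below; not a LangChain class."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def summarize_hits(docs: Iterable[Any], max_chars: int = 60) -> List[str]:
    """Return one ranked line per document: rank, source, and a text snippet."""
    lines = []
    for rank, doc in enumerate(docs, start=1):
        source = doc.metadata.get("source", "unknown")
        snippet = doc.page_content[:max_chars]
        lines.append(f"{rank}. [{source}] {snippet}")
    return lines


# In a real application, `docs` would come from dvs.similarity_search(...).
docs = [
    FakeDocument("Vector search finds similar items.", {"source": "intro.md"}),
    FakeDocument("Indexes sync from Delta tables."),
]
for line in summarize_hits(docs):
    print(line)
```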
