An integration package connecting Databricks and LangChain

🦜️🔗 LangChain Databricks

This repository provides LangChain components to connect your LangChain application with various Databricks services.

Features

  • 🤖 LLMs: The ChatDatabricks component allows you to access chat endpoints hosted on Databricks Model Serving, including state-of-the-art models such as Llama3, Mixtral, and DBRX, as well as your own fine-tuned models.
  • 📐 Vector Store: The DatabricksVectorSearch component connects to Databricks Vector Search, a serverless similarity search engine that builds auto-updating indexes from Delta tables managed by Unity Catalog.
  • 🔢 Embeddings: Provides components for working with embedding models hosted on Databricks Model Serving.
  • 📊 MLflow Integration: LangChain Databricks components are fully integrated with MLflow, providing LLMOps capabilities such as experiment tracking, dependency management, evaluation, and tracing (observability).

Note: This repository will replace all Databricks integrations currently present in the langchain-community package. Users are encouraged to migrate to this repository as soon as possible.

Installation

You can install the langchain-databricks package from PyPI.

pip install -U langchain-databricks

If you are using this package outside a Databricks workspace, configure credentials by setting the following environment variables:

export DATABRICKS_HOSTNAME="https://your-databricks-workspace"
export DATABRICKS_TOKEN="your-personal-access-token"

Instead of a personal access token (PAT), you can also use OAuth machine-to-machine (M2M) authentication with a service principal:

export DATABRICKS_HOSTNAME="https://your-databricks-workspace"
export DATABRICKS_CLIENT_ID="your-service-principal-client-id"
export DATABRICKS_CLIENT_SECRET="your-service-principal-secret"
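
Before constructing any component, it can help to confirm which of the two credential sets is actually present in the environment. The following is a minimal sketch, not part of the package: the helper name `detect_auth_mode` and the "pat" / "oauth-m2m" labels are hypothetical, and only the variable names come from the instructions above.

```python
import os

# Variable names are taken from the instructions above. Grouping them into a
# "PAT" set and an "OAuth M2M" set is an illustrative assumption.
PAT_VARS = ("DATABRICKS_HOSTNAME", "DATABRICKS_TOKEN")
OAUTH_VARS = (
    "DATABRICKS_HOSTNAME",
    "DATABRICKS_CLIENT_ID",
    "DATABRICKS_CLIENT_SECRET",
)


def detect_auth_mode(env):
    """Return "pat", "oauth-m2m", or None depending on which variables are set."""

    def all_set(names):
        # A variable counts as set only if it exists and is non-empty.
        return all(env.get(name) for name in names)

    if all_set(PAT_VARS):
        return "pat"
    if all_set(OAUTH_VARS):
        return "oauth-m2m"
    return None


if __name__ == "__main__":
    mode = detect_auth_mode(os.environ)
    print(f"Detected credentials: {mode or 'none (set the variables above)'}")
```

The check only inspects the environment; it does not contact the workspace or validate that the token or secret is accepted.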

Chat Models

ChatDatabricks is a Chat Model class to access chat endpoints hosted on Databricks, including state-of-the-art models such as Llama3, Mixtral, and DBRX, as well as your own fine-tuned models.

from langchain_databricks import ChatDatabricks

chat_model = ChatDatabricks(endpoint="databricks-meta-llama-3-70b-instruct")
chat_model.invoke("Sing a ballad of LangChain.")

See the usage example for more guidance on how to use it within your LangChain application.
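
Beyond a single string prompt, LangChain chat models also accept a list of (role, content) pairs. The sketch below assumes that behavior and wraps the call in two hypothetical helpers, `build_messages` and `ask`; the default system prompt is made up for illustration, and the endpoint name is the one used in the example above.

```python
def build_messages(question, system_prompt="You are a concise assistant."):
    """Build a (role, content) message list, a format LangChain chat models accept."""
    return [("system", system_prompt), ("human", question)]


def ask(question, endpoint="databricks-meta-llama-3-70b-instruct"):
    """Send one question to a Databricks chat endpoint and return the reply text."""
    # Imported lazily so build_messages can be used without the package installed.
    from langchain_databricks import ChatDatabricks

    chat_model = ChatDatabricks(endpoint=endpoint)
    response = chat_model.invoke(build_messages(question))
    # invoke() returns a message object; its text is in the .content attribute.
    return response.content
```

Calling `ask("Sing a ballad of LangChain.")` requires valid credentials and an endpoint with that name in your workspace.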

Note: The Databricks LLM class still lives in the langchain-community library. It will be deprecated in the future, so ChatDatabricks is recommended for access to the latest features.

Embeddings

DatabricksEmbeddings is an Embeddings class to access text-embedding endpoints hosted on Databricks, including state-of-the-art models such as BGE, as well as your own fine-tuned models.

from langchain_databricks import DatabricksEmbeddings

embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en")

See the usage example for more guidance on how to use it within your LangChain application.
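
Once an embeddings object exists, a common next step is to embed a query and a few texts and compare them. This is a rough sketch that relies on the standard LangChain Embeddings methods `embed_query` and `embed_documents`; the helpers `cosine_similarity` and `rank_by_similarity` are hypothetical and not part of the package.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, texts, endpoint="databricks-bge-large-en"):
    """Embed the query and texts, then return (score, text) pairs, best first."""
    # Imported lazily so cosine_similarity can be used without the package installed.
    from langchain_databricks import DatabricksEmbeddings

    embeddings = DatabricksEmbeddings(endpoint=endpoint)
    query_vec = embeddings.embed_query(query)
    doc_vecs = embeddings.embed_documents(texts)
    scored = [(cosine_similarity(query_vec, vec), text) for vec, text in zip(doc_vecs, texts)]
    return sorted(scored, reverse=True)
```

For more than a handful of texts, an index such as Databricks Vector Search is a better fit than ranking in Python.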

Vector Search

Databricks Vector Search is a serverless similarity search engine that allows you to store a vector representation of your data, including metadata, in a vector database. With Vector Search, you can create auto-updating vector search indexes from Delta tables managed by Unity Catalog and query them with a simple API to return the most similar vectors.

from langchain_databricks.vectorstores import DatabricksVectorSearch

dvs = DatabricksVectorSearch(
    index_name="<YOUR_INDEX_NAME>",
    text_column="text",
    columns=["source"]
)
docs = dvs.similarity_search("What is vector search?")

See the usage example for how to set up vector indices and integrate them with LangChain.
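
The documents returned by a search carry their text in `page_content` and extra columns in `metadata`. The sketch below mirrors the constructor arguments shown above and assumes the requested `source` column is surfaced in `metadata`; the helpers `format_hits` and `search` are hypothetical, and the sample document is a stand-in so the formatting step can be tried offline.

```python
from types import SimpleNamespace


def format_hits(docs):
    """Turn search results into numbered display lines with their source."""
    lines = []
    for rank, doc in enumerate(docs, start=1):
        source = doc.metadata.get("source", "unknown")
        lines.append(f"{rank}. [{source}] {doc.page_content}")
    return lines


def search(query, index_name, k=3):
    """Return the k most similar documents from a Databricks Vector Search index."""
    # Imported lazily so format_hits can be used without the package installed.
    from langchain_databricks.vectorstores import DatabricksVectorSearch

    dvs = DatabricksVectorSearch(
        index_name=index_name,
        text_column="text",
        columns=["source"],
    )
    return dvs.similarity_search(query, k=k)


# Offline demonstration with a stand-in object shaped like a LangChain Document.
sample = [
    SimpleNamespace(
        page_content="Vector search finds similar items.",
        metadata={"source": "docs/intro.md"},
    )
]
print(format_hits(sample))
```

With a real index, `format_hits(search("What is vector search?", "<YOUR_INDEX_NAME>"))` would produce the same kind of listing.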
