
An integration package connecting lakeFS and LangChain


langchain-lakefs

This package provides a LangChain integration with lakeFS, allowing you to load documents from lakeFS repositories into your LangChain workflows.

Features

  • Load documents from lakeFS repositories using the official lakeFS Python SDK
  • Support for user metadata retrieval
  • Configurable repository, reference, and path specifications
  • Integration with LangChain's document loading infrastructure

Installation

pip install -U langchain-lakefs

Configuration

You can configure the LakeFSLoader in three ways:

1. Direct Initialization

Provide the access key, secret key, and endpoint during initialization:

from langchain_lakefs.document_loaders import LakeFSLoader

lakefs_loader = LakeFSLoader(
    lakefs_access_key='your_access_key',
    lakefs_secret_key='your_secret_key',
    lakefs_endpoint='https://path-to.lakefs.com',
    repo='your_repo',
    ref='main',
    path='path/to/files'
)

2. Configuration File

The package will automatically read credentials from the ~/.lakectl.yaml file if available.
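For orientation, the sketch below renders the text of such a file from three values. It assumes the layout lakectl conventionally uses (a credentials section and a server section, matching the LAKECTL_* variable names); render_lakectl_config is a hypothetical helper and not part of langchain-lakefs, so check the result against your lakectl version before relying on it.

```python
def render_lakectl_config(access_key: str, secret_key: str, endpoint: str) -> str:
    """Return YAML text in the layout lakectl is assumed to use (see note above)."""
    return (
        "credentials:\n"
        f"  access_key_id: {access_key}\n"
        f"  secret_access_key: {secret_key}\n"
        "server:\n"
        f"  endpoint_url: {endpoint}\n"
    )


if __name__ == "__main__":
    # Placeholder values; write the output to ~/.lakectl.yaml only after reviewing it.
    print(render_lakectl_config(
        "your_access_key", "your_secret_key", "https://path-to.lakefs.com"
    ))
```

Running lakectl config is the usual way to create this file; the helper only illustrates what the loader is expected to find there.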

3. Environment Variables

Set the following environment variables to configure the loader:

export LAKECTL_CREDENTIALS_ACCESS_KEY_ID='your_access_key'
export LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY='your_secret_key'
export LAKECTL_SERVER_ENDPOINT_URL='https://path-to.lakefs.com'
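If you prefer to fail early when a variable is missing, or to pass the values explicitly, a minimal sketch could map the three variables above onto the constructor keywords used under Direct Initialization. The helper name loader_kwargs_from_env and the ENV_TO_KWARG table are illustrative assumptions, not part of the package.

```python
import os
from typing import Dict, Mapping, Optional

# Environment variable name -> LakeFSLoader keyword argument (both taken from this README).
ENV_TO_KWARG = {
    "LAKECTL_CREDENTIALS_ACCESS_KEY_ID": "lakefs_access_key",
    "LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY": "lakefs_secret_key",
    "LAKECTL_SERVER_ENDPOINT_URL": "lakefs_endpoint",
}


def loader_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect lakeFS credentials from the environment, raising if any is unset or empty."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_KWARG if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {kwarg: env[name] for name, kwarg in ENV_TO_KWARG.items()}
```

The result can then be unpacked into the constructor, for example LakeFSLoader(**loader_kwargs_from_env(), repo='your_repo', ref='main', path='path/to/files').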

Usage

Document Loader

The LakeFSLoader class allows you to load documents from lakeFS. You need to specify:

  • The repository (repo)
  • The reference (ref): a branch, commit, or tag
  • The path to the files you want to load

To also load each file's user metadata, set the user_metadata parameter to True:

from langchain_lakefs.document_loaders import LakeFSLoader

# Initialize the loader
lakefs_loader = LakeFSLoader(
    lakefs_access_key='your_access_key',
    lakefs_secret_key='your_secret_key',
    lakefs_endpoint='https://path-to.lakefs.com',
    repo='your_repo',
    ref='main',
    path='path/to/files',
    user_metadata=True
)

# Load documents from lakeFS
documents = lakefs_loader.load()

# Process the documents
for doc in documents:
    print(f"Content: {doc.page_content}")
    print(f"Metadata: {doc.metadata}")
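Once documents are loaded, their metadata can drive simple selection logic. The following is a sketch only: Doc is a stand-in with the same two attributes used above (page_content and metadata) so the example runs without a lakeFS server, and the "owner" key is a hypothetical piece of user metadata. This README does not state how user metadata keys are named inside doc.metadata, so inspect a real document first.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class Doc:
    """Stand-in exposing page_content and metadata, like the documents printed above."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def filter_by_metadata(documents: Iterable[Any], key: str, value: Any) -> List[Any]:
    """Keep only the documents whose metadata maps `key` to `value`."""
    return [doc for doc in documents if doc.metadata.get(key) == value]


docs = [
    Doc("quarterly report", {"owner": "finance"}),  # "owner" is a hypothetical key
    Doc("design notes", {"owner": "platform"}),
    Doc("no metadata"),
]
finance_docs = filter_by_metadata(docs, "owner", "finance")
print([d.page_content for d in finance_docs])
```

The same function should work on the list returned by lakefs_loader.load(), since it relies only on the metadata attribute.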

Modifying Loader Settings

You can modify the loader settings after initialization:

# Change the repository
lakefs_loader.set_repo("another-repo")

# Change the reference (branch, commit, or tag)
lakefs_loader.set_ref("feature-branch")

# Change the path
lakefs_loader.set_path("another/path")

# Enable user metadata retrieval (pass False to disable it)
lakefs_loader.set_user_metadata(True)
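One use for these setters is to reuse a single loader across several references, for instance to compare a branch against main. The sketch below relies only on set_ref() and load() as shown in this README; load_per_ref and FakeLoader are hypothetical names, and FakeLoader exists solely so the example can run without credentials.

```python
from typing import Any, Dict, Iterable, List


def load_per_ref(loader: Any, refs: Iterable[str]) -> Dict[str, List[Any]]:
    """Load documents once per ref, reusing one loader via set_ref()."""
    results: Dict[str, List[Any]] = {}
    for ref in refs:
        loader.set_ref(ref)
        results[ref] = loader.load()
    return results


class FakeLoader:
    """Hypothetical stand-in exposing only set_ref() and load(), for a dry run."""

    def __init__(self) -> None:
        self.ref = "main"

    def set_ref(self, ref: str) -> None:
        self.ref = ref

    def load(self) -> List[str]:
        return [f"document from {self.ref}"]


by_ref = load_per_ref(FakeLoader(), ["main", "feature-branch"])
print(by_ref)
```

Note that the loader stays pointed at the last ref in the list afterwards, so call set_ref() again if later code expects a different one.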

Examples

Loading Documents from a Specific Path

from langchain_lakefs.document_loaders import LakeFSLoader

loader = LakeFSLoader(
    lakefs_endpoint="https://example.my-lakefs.com",
    lakefs_access_key="your-access-key",
    lakefs_secret_key="your-secret-key",
    repo="my-repo",
    ref="main",
    path="data/documents"
)

documents = loader.load()
