
langchain embeddings wrapper to persist embeddings for re-use later

Project description

langchain-s3-cached-embeddings

Proxies any langchain Embeddings class, such as OpenAIEmbeddings or GoogleGenerativeAIEmbeddings, and persists every generated embedding to S3. Subsequent calls can optionally reuse the cached embeddings, avoiding the additional, unnecessary cost of re-embedding.
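To make the idea concrete, here is a minimal sketch of the proxy-and-persist pattern described above. It is not this package's implementation: the class name CachingEmbedder, the use_cache flag, and the in-memory dict standing in for an S3 bucket are all hypothetical, and entries are keyed by the document text purely for simplicity.

```python
from typing import Callable, Dict, List

Vector = List[float]


class CachingEmbedder:
    """Toy proxy: wraps an embed function and persists every result it produces."""

    def __init__(
        self,
        embed_fn: Callable[[List[str]], List[Vector]],
        store: Dict[str, Vector],  # hypothetical stand-in for the S3 bucket
        use_cache: bool = False,   # hypothetical flag, not the package's API
    ) -> None:
        self.embed_fn = embed_fn
        self.store = store
        self.use_cache = use_cache

    def embed_documents(self, texts: List[str]) -> List[Vector]:
        if self.use_cache:
            # Cache-only path: never call the wrapped embedder; a miss is an error.
            try:
                return [self.store[text] for text in texts]
            except KeyError as exc:
                raise LookupError(f"no cached embedding for {exc.args[0]!r}") from exc
        # Default path: embed with the wrapped function, then persist each vector.
        vectors = self.embed_fn(texts)
        for text, vector in zip(texts, vectors):
            self.store[text] = vector
        return vectors
```

A first pass with use_cache=False pays for the embeddings once and fills the store; a later pass with use_cache=True reads them back without calling the wrapped embedder at all.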

Install

pip install langchain-s3-cached-embeddings

Usage

from langchain_openai import OpenAIEmbeddings
# Module path assumed to match the package name.
from langchain_s3_cached_embeddings import S3EmbeddingsConduit

embeddings = S3EmbeddingsConduit(
    embeddings=OpenAIEmbeddings(model=model),  # required
    bucket="my-embeddings-bucket",  # required
    prefix="my-optional-prefix",
)

Advanced Usage

embeddings = S3EmbeddingsConduit(
    embeddings=OpenAIEmbeddings(model=model),  # required
    bucket="my-embeddings-bucket",  # required
    prefix="my-optional-prefix",
    filenaming_function=None,  # Optional[Callable[[str, int], str]], defaults to None
    cache_behavior=CacheBehavior.NO_CACHE,  # default
)

Usage Options

  • embeddings - (required) any class implementing langchain_core.embeddings.Embeddings
  • bucket - (required) the S3 bucket name
  • prefix - (optional) the S3 key prefix
  • filenaming_function - (optional) receives two arguments, 1. the file contents (str), 2. the index (int), e.g. 9 for the 10th document, and returns the filename (str)
  • cache_behavior - (optional)
    • CacheBehavior.NO_CACHE - do not use cached embeddings, instead embed using the embeddings class' standard embed_documents(...) method
    • CacheBehavior.ONLY_CACHE - use cached embeddings. If the embeddings are not present, an exception is raised
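
As an illustration of the filenaming_function option, the sketch below follows the (str, int) -> str signature listed above and derives the name from a SHA-256 hash of the document contents. The function name content_hash_filename and the ".json" extension are assumptions made for this example; the format the package actually stores is not specified here.

```python
import hashlib


def content_hash_filename(contents: str, index: int) -> str:
    """Return a cache filename derived from the document text.

    `index` is accepted to match the (str, int) -> str signature but is
    ignored, so identical text maps to the same name at any position.
    """
    digest = hashlib.sha256(contents.encode("utf-8")).hexdigest()
    # The ".json" extension is an assumption for illustration only.
    return f"{digest}.json"
```

It could then be passed as filenaming_function=content_hash_filename. Naming by content rather than by index means a reordered document list still resolves to the same names.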

License

MIT

