Use this Python client to embed documents with VectorFlow, an open-source, high-throughput, production-ready vector embedding pipeline.
VectorFlow Python Client
Use this Python client to embed documents with VectorFlow and check on the status of those embeddings.
How to Use
The client has two methods for uploading documents to embed and two for checking job statuses, listed below. All four methods return a Response object from the Python requests library. Parse the response body with its .json() method.
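As a minimal sketch of that parsing step, the helper below checks the HTTP status before decoding the body. `raise_for_status()` and `.json()` are standard methods on a requests Response; the `parse_response` helper, the `FakeResponse` stand-in, and the `example_key` field are hypothetical names used only so the snippet runs without a VectorFlow server.

```python
import json


def parse_response(response):
    """Return the decoded JSON body, or raise if the request failed."""
    # On a real requests.Response, raise_for_status() raises for 4xx/5xx
    # status codes and does nothing otherwise.
    response.raise_for_status()
    return response.json()


class FakeResponse:
    """Hypothetical stand-in for requests.Response, used only for this demo."""

    def __init__(self, status_code, body):
        self.status_code = status_code
        self._body = body

    def raise_for_status(self):
        if self.status_code >= 400:
            raise RuntimeError(f"HTTP {self.status_code}")

    def json(self):
        return json.loads(self._body)


# The key below is made up; the real VectorFlow payload may look different.
payload = parse_response(FakeResponse(200, '{"example_key": "example_value"}'))
print(payload["example_key"])
```

With the real client, the same helper could presumably be applied to the return value of any of the four methods.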
Initialize
from vectorflow_client.vectorflow import Vectorflow
vectorflow = Vectorflow()
vectorflow.embedding_api_key = "YOUR_OPEN_AI_KEY"
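Rather than hard-coding the key in source, it can be read from the environment. The sketch below is one way to do that; the `load_api_key` helper and the `OPENAI_API_KEY` variable name are assumptions for illustration, not part of the client.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Read an API key from the environment, failing loudly if it is unset."""
    # env_var is an assumed name; use whatever variable your deployment sets.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

The result would then be assigned as `vectorflow.embedding_api_key = load_api_key()`.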
Embed Multiple Files
paths = ['./src/api/tests/fixtures/test_pdf.pdf', './src/api/tests/fixtures/test_medium_text.txt']
response = vectorflow.upload(paths)
Embed a Single File
filepath = './src/api/tests/fixtures/test_medium_text.txt'
response = vectorflow.embed(filepath)
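Both upload calls above take filesystem paths, so it can help to confirm the files exist before sending anything. This is a small sketch using only the standard library; `check_paths` is a hypothetical helper, not something the client provides.

```python
from pathlib import Path


def check_paths(paths):
    """Return the paths as strings, raising if any is not an existing file."""
    missing = [str(p) for p in paths if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError("Not found: " + ", ".join(missing))
    return [str(p) for p in paths]
```

For example, `vectorflow.upload(check_paths(paths))` would fail early with a clear message instead of partway through an upload.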
Get Statuses for Multiple Jobs
response = vectorflow.get_job_statuses(jobs_ids)
Get Status for Single Job
response = vectorflow.get_job_status(job_id)
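Embedding jobs run asynchronously, so a caller will typically check the status repeatedly until the job finishes. The loop below is a generic polling sketch: `poll_until` is a hypothetical helper, and because the shape of the status payload is not documented here, the caller supplies its own `is_done` test rather than the helper assuming a field name.

```python
import time


def poll_until(fetch, is_done, interval_seconds=5.0, timeout_seconds=600.0):
    """Call fetch() repeatedly until is_done(result) is true or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        result = fetch()
        if is_done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        time.sleep(interval_seconds)
```

With the client, `fetch` could be `lambda: vectorflow.get_job_status(job_id).json()`, and `is_done` would inspect whichever field of the parsed body reports completion.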
Notes on Default Setup
By default, the client is configured to embed files locally and upload the resulting vectors to a local instance of Qdrant. This assumes you followed the default configuration in the VectorFlow repository's setup.sh, which uses Docker Compose to run a collection of Docker images locally. Those images embed the documents with OpenAI's ADA model and upload the embeddings to the local Qdrant instance.
For more granular control over the chunking, embedding, and vector DB configurations, override the default values on the Vectorflow class or on its embeddings_metadata and vector_db_metadata fields. For example:
from vectorflow_client.vectorflow import Vectorflow
from vectorflow_client.embeddings_type import EmbeddingsType
from vectorflow_client.vector_db_type import VectorDBType
vectorflow = Vectorflow()
# use open source sentence transformer model
vectorflow.embeddings_metadata.hugging_face_model_name = "thenlper/gte-base"
vectorflow.embeddings_metadata.embeddings_type = EmbeddingsType.HUGGING_FACE
# use Pinecone
vectorflow.vector_db_metadata.vector_db_type = VectorDBType.PINECONE
vectorflow.vector_db_metadata.environment = "us-east-1-aws"
vectorflow.vector_db_metadata.index_name = "test"
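When many fields are overridden, the settings can be kept in a dictionary and applied in one pass, which also catches misspelled field names. This is only a sketch: `apply_overrides` is a hypothetical helper, the `SimpleNamespace` object stands in for `vectorflow.vector_db_metadata`, and it assumes the metadata objects are plain Python objects whose fields already exist with default values.

```python
from types import SimpleNamespace


def apply_overrides(target, overrides):
    """Set each override on target, rejecting names target does not have."""
    for name, value in overrides.items():
        if not hasattr(target, name):
            raise AttributeError(f"{type(target).__name__} has no field {name!r}")
        setattr(target, name, value)
    return target


# Stand-in for vectorflow.vector_db_metadata so the snippet runs on its own.
vector_db_metadata = SimpleNamespace(vector_db_type=None, environment=None, index_name=None)
apply_overrides(vector_db_metadata, {"environment": "us-east-1-aws", "index_name": "test"})
print(vector_db_metadata.index_name)
```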
Hashes for vectorflow_client-0.0.6-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 302a0a76abaa3189e423811ef182b4ca64c59aa12995ffce91224f9bf5aa1f69
MD5 | d058a7c3585f1ccbd7606776f5cfd3d4
BLAKE2b-256 | 65e6c073f656343ff07cd777e7d61c45f7fc811f67fa3f076a96b8a328f2ae5b