Hyperparameter optimisation for your LLM embeddings.

💛 Vectorboard - alpha 0.0.1

Embeddings Optimization and Eval Framework for RAG/LLM Applications

Find the best hyperparameters for embedding your data in a RAG pipeline.

TL;DR

  1. Install vectorboard
pip install vectorboard
  2. Create a grid of the parameters you want to experiment with. For example:
param_grid = {
    "chunk_size": [50, 300, 500],
    "vector_store": [FAISS],
    "embeddings": [OpenAIEmbeddings(), HuggingFaceEmbeddings()],
}
  3. Run the search using GridSearch() (more search types upcoming).
from langchain.chains import RetrievalQA
from vectorboard.search import GridSearch

# Create a GridSearch with the chain you'd like to try.
grid_search = GridSearch(chain=RetrievalQA)

# Pass a document loader for your data, then run the eval queries.
grid_search.create_experiments(param_grid=param_grid, loader=loader)
grid_search.run(eval_queries=eval_queries)
grid_search.results()

Step by step overview of the example

Import GridSearch from vectorboard.search:

from vectorboard.search import GridSearch

Create a dict with parameters and steps you want to search over.

param_grid = {
    "chunk_size": [50, 300, 500],
    "vector_store": [FAISS],
    "embeddings": [OpenAIEmbeddings(), HuggingFaceEmbeddings()],
}
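
As a rough illustration of what a grid like this implies, the sketch below expands a parameter grid into every combination of values using the standard library. It is not Vectorboard's internal code: expand_grid is a hypothetical helper, and plain strings stand in for the real classes so the snippet runs without extra dependencies.

```python
from itertools import product


def expand_grid(param_grid):
    """Return one dict per combination of parameter values (Cartesian product)."""
    keys = list(param_grid)
    value_lists = [param_grid[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


# Strings are stand-ins (an assumption for this sketch) for FAISS and the embedding objects.
param_grid = {
    "chunk_size": [50, 300, 500],
    "vector_store": ["FAISS"],
    "embeddings": ["OpenAIEmbeddings", "HuggingFaceEmbeddings"],
}

combinations = expand_grid(param_grid)
print(len(combinations))  # 3 * 1 * 2 = 6 combinations
print(combinations[0])
```

Each added value multiplies the number of combinations, so it is worth keeping the grid small while you experiment.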

If a parameter is not one of the simple types (int, str, ...), you need to import its class first. E.g. to try different embedding models with a vector store, import them (we use langchain for simplicity):

from langchain.embeddings import OpenAIEmbeddings, HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

Initialize a GridSearch object with the chain you want to run your experiments on. Currently only RetrievalQA is supported. More chains and custom chains are on the roadmap.

from langchain.chains import RetrievalQA
search = GridSearch(chain=RetrievalQA)

Import a loader relevant to your data and include it as a parameter to .create_experiments().

from langchain.document_loaders import PyPDFLoader
loader = PyPDFLoader("recycling.pdf") # For example
search.create_experiments(param_grid=param_grid, loader=loader)

If you already loaded your data or have it available, use:

search.create_experiments(param_grid=param_grid, documents=YOUR_DOCS)

Define eval queries and run the experiments:

eval_queries = [
    "what percentage of waste was recycled into materials in 2022?",
    # ...
]
search.run(eval_queries=eval_queries)
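
Conceptually, running eval queries over a grid means asking every query against every configuration and collecting the answers for comparison. The sketch below shows that shape only, under stated assumptions: run_grid and fake_answer are hypothetical names, and fake_answer is a stand-in for building and calling a real retrieval chain. How .run() works internally is not documented here.

```python
from itertools import product


def run_grid(param_grid, eval_queries, answer_fn):
    """Call answer_fn once per (configuration, query) pair and collect the rows."""
    keys = list(param_grid)
    rows = []
    for values in product(*(param_grid[key] for key in keys)):
        params = dict(zip(keys, values))
        for query in eval_queries:
            rows.append(
                {"params": params, "query": query, "answer": answer_fn(params, query)}
            )
    return rows


def fake_answer(params, query):
    # Hypothetical stand-in: a real run would embed, index, retrieve and call an LLM.
    return f"[chunk_size={params['chunk_size']}] answer to: {query}"


rows = run_grid({"chunk_size": [50, 300]}, ["q1", "q2"], fake_answer)
print(len(rows))  # 2 configurations * 2 queries = 4 rows
```

Keeping the parameters next to each answer makes it easy to compare configurations query by query.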

Finally, view the results in a Gradio app using the .results() method. To get a publicly available link to share with your team, set the share=True parameter.

search.results(share=True)

Overview and Core concepts

RAG (Retrieval-Augmented Generation) is great, but it depends heavily on the quality of the embeddings.

But it has challenges. It's hard to find the right

Current status

  • Built on top of 🦜⛓️Langchain
  • Using Gradio for the final result page (with shareable links)

Currently supported steps and parameters

  1. Embeddings
  2. Text and Document transformers
  3. Vector Databases
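
To make the effect of a text-transformer parameter such as chunk_size concrete, here is a minimal character-based splitter. It is a simplified sketch, not the splitter Vectorboard or langchain uses: split_text and its chunk_overlap argument are hypothetical names chosen for illustration.

```python
def split_text(text, chunk_size, chunk_overlap=0):
    """Split text into fixed-size character chunks, optionally overlapping."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


text = "a" * 1200  # placeholder document of 1200 characters
for size in (50, 300, 500):
    chunks = split_text(text, size)
    print(size, len(chunks))  # 50 -> 24 chunks, 300 -> 4 chunks, 500 -> 3 chunks
```

Smaller chunks produce more, narrower vectors to embed and search; larger chunks produce fewer vectors that each carry more context. That trade-off is what a search over chunk_size explores.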

Roadmap

  • Support more types of Search.
  • Support more chains. LLMChain and custom chains in progress.
  • Add async support to run Experiments() in parallel.
  • TS/JS support.
  • Add Eval tools and metrics.

Have a special feature request? Send your feedback or suggestions to our Discord community.
