
Hyperparameter optimisation for your LLM embeddings.


💛 Vectorboard - alpha 0.0.1


Embeddings Optimization and Eval Framework for RAG/LLM Applications

Find the best hyperparameters for embedding your data in a RAG pipeline


TL;DR

  1. Install vectorboard
pip install vectorboard
  2. Create a grid of parameters you want to experiment with. For example:
param_grid = {
    "chunk_size": [50, 300, 500],
    "vector_store": [FAISS],
    "embeddings": [OpenAIEmbeddings(), HuggingFaceEmbeddings()],
}
  3. Run the search using a GridSearch() (more search types upcoming).
from vectorboard.search import GridSearch

# Create a GridSearch with the chain you'd like to try.
grid_search = GridSearch(chain=RetrievalQA)

# Use a document loader
grid_search.create_experiments(loader, param_grid=param_grid)
grid_search.run(eval_queries=eval_queries)
grid_search.results()

Step-by-step overview of the example

Import GridSearch() from vectorboard.search

from vectorboard.search import GridSearch

Create a dict with parameters and steps you want to search over.

param_grid = {
    "chunk_size": [50, 300, 500],
    "vector_store": [FAISS],
    "embeddings": [OpenAIEmbeddings(), HuggingFaceEmbeddings()],
}

If a parameter is not one of the simple types (int, str, ...), you need to import its class. For example, to try different embedding algorithms or vector stores, first import them (we use langchain for simplicity):

from langchain.embeddings import OpenAIEmbeddings, HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

Initialize a Vectorboard search object with the chain you want to run your experiments with. Currently RetrievalQA is supported; more chains and custom chains are on the roadmap.

from langchain.chains import RetrievalQA

search = GridSearch(chain=RetrievalQA)

Import a loader relevant to your data and include it as a parameter to .create_experiments().

from langchain.document_loaders import PyPDFLoader
loader = PyPDFLoader("recycling.pdf") # For example
search.create_experiments(param_grid=param_grid, loader=loader)

If you have already loaded your data or have it available, use:

search.create_experiments(param_grid=param_grid, documents=YOUR_DOCS)
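If your data is already in memory, a minimal sketch of preparing it yourself might look like this (assuming langchain's Document class; the texts and metadata below are purely illustrative):

from langchain.schema import Document

# Purely illustrative texts; replace with however your data is already loaded.
raw_texts = [
    "Recycling rates vary widely by material and region.",
    "Glass and aluminium are among the most commonly recycled materials.",
]
docs = [
    Document(page_content=text, metadata={"source": f"note-{i}"})
    for i, text in enumerate(raw_texts)
]

search.create_experiments(param_grid=param_grid, documents=docs)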

Define eval queries and run the experiments:

eval_queries = [
    "what percentage of waste is recyvled into materials in 2022?",
    # ...
]
search.run(eval_queries=eval_queries)

Finally, view the results in a Gradio app using the .results() method. To get a publicly available link to share with your team, set the share=True parameter.

search.results(share=True)

Overview and Core concepts

RAG (Retrieval Augmented Generation) is great, but it places a huge emphasis on getting the embeddings right.

And that is where the challenge lies: it's hard to find the right combination of chunk size, embedding model, and vector store for your own data without systematically trying them.
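To get a feel for why doing this by hand is tedious, note that a grid search simply evaluates every combination of the parameter values, so even a small grid multiplies quickly. A plain-Python sketch (not vectorboard's internals) of the example grid above, with the classes written as strings for brevity:

from itertools import product

param_grid = {
    "chunk_size": [50, 300, 500],
    "vector_store": ["FAISS"],
    "embeddings": ["OpenAIEmbeddings()", "HuggingFaceEmbeddings()"],
}

# Every combination becomes one experiment to embed, index, and evaluate.
combinations = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]
print(len(combinations))  # 3 * 1 * 2 = 6 experiments, before adding any more knobs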

Current status

  • Built on top of 🦜⛓️Langchain
  • Using Gradio for the final result page (with shareable links)

Currently supported steps and parameters

  1. Embeddings
  2. Text and Document transformers
  3. Vector Databases
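For example, a grid touching all three of these categories might look like the sketch below. The chunk_size, embeddings, and vector_store keys are the ones shown elsewhere on this page; Chroma is included only as an assumption that other langchain vector stores plug in the same way FAISS does:

from langchain.embeddings import OpenAIEmbeddings, HuggingFaceEmbeddings
from langchain.vectorstores import FAISS, Chroma  # Chroma is an assumption, not a documented example

param_grid = {
    "chunk_size": [100, 500, 1000],  # text/document transformer knob
    "embeddings": [OpenAIEmbeddings(), HuggingFaceEmbeddings()],
    "vector_store": [FAISS, Chroma],
}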

Roadmap

  • Support more types of Search.
  • Support more chains. LLMChain and custom chains in progress.
  • Add async support to run Experiments() in parallel.
  • TS/JS support.
  • Add Eval tools and metrics.

Have a special feature request? Share your feedback and suggestions in our Discord community.
