A package for testing vector embeddings.
Vecta Testa
A test library for vector database search testing.
About
To architect systems that deliver advanced search capabilities, we need to be able to measure and understand search results. Only through testing can we determine which approach is appropriate for our data and problem space. This library evaluates how well searches locate an expected value from a vector store and graphs the result to help communicate our findings.
View our Jupyter Notebook for more details on this graph.
Vecta Testa Runner
The runner is the main component of the library. It defines how the tests are run and their properties.
from vekta_testa import runner
Here's an example of using the runner:
results = runner.run_vecta_tests(
    # Vector indexes to compare.
    [
        runner.EmbeddingIndex(
            index_name='openai',
            search_function=lambda scenario: faiss_index_openai.similarity_search_with_score(scenario, k=900),
            find_result=lambda values, target: find_id(values, target),
            post_processor=lambda scenario, scored_values: post_processor(scenario, scored_values)
        ),
        runner.EmbeddingIndex(
            index_name='deepset/all-mpnet-base-v2',
            search_function=lambda scenario: mpnet_index.similarity_search_with_score(scenario, k=900),
            find_result=lambda values, target: find_id(values, target),
            post_processor=lambda scenario, scored_values: post_processor(scenario, scored_values)
        ),
    ],
    # Test cases (queries) to run against every index.
    [
        runner.Testcase(
            case_id='1 - exact matching text',
            scenario="Thane has an appealing 2 BHK flat for sale with various amenities. Situated in the excellent Swastik Alps township."
        ),
        runner.Testcase(
            case_id='2 - partial match',
            scenario="Thane has an appealing 2 BHK flat for sale with various amenities. It's located in the exquisite township near the mountains."
        ),
        runner.Testcase(
            case_id='3 - using other words',
            scenario='Offers a charming two bedroom apartment available, equipped with numerous features. Located in the prime Swastik Heights community.'
        ),
    ],
    # Expected value that each search should locate.
    '284'
)
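To make the call above easier to follow, here is a minimal sketch of the kind of loop a runner like this might perform for each test case. It is not the library's implementation: `evaluate`, `toy_search`, `find_position`, and the `CORPUS` data are hypothetical, and the toy search ranks documents by shared words rather than by embeddings. Only the callable shapes (a search function, an optional post-processor, and a result finder) follow the EmbeddingIndex attributes described in this README.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Scored = List[Tuple[Any, float]]

# Toy corpus: document ID -> text (illustrative data only).
CORPUS = {
    "101": "studio apartment near the station",
    "284": "2 BHK flat for sale in Thane with various amenities",
    "305": "office space for rent in the city centre",
}


def toy_search(scenario: str) -> Scored:
    """Stand-in for a vector search: score documents by shared words."""
    query_words = set(scenario.lower().split())
    scored = [
        (doc_id, float(len(query_words & set(text.lower().split()))))
        for doc_id, text in CORPUS.items()
    ]
    # Highest overlap first, mimicking a best-match-first result list.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def find_position(values: Scored, target: str) -> int:
    """Return the 0-based position of target in values, or -1 if absent."""
    for position, (doc_id, _score) in enumerate(values):
        if doc_id == target:
            return position
    return -1


def evaluate(
    search_function: Callable[[str], Scored],
    find_result: Callable[[Scored, str], int],
    scenarios: Dict[str, str],
    target: str,
    post_processor: Optional[Callable[[str, Scored], Scored]] = None,
) -> Dict[str, int]:
    """Hypothetical runner loop: search, optionally post-process, then locate target."""
    positions = {}
    for case_id, scenario in scenarios.items():
        values = search_function(scenario)
        if post_processor is not None:
            values = post_processor(scenario, values)
        positions[case_id] = find_result(values, target)
    return positions


positions = evaluate(
    toy_search,
    find_position,
    {
        "1 - exact": "2 BHK flat for sale in Thane",
        "2 - unrelated": "office space for rent",
    },
    target="284",
)
print(positions)  # {'1 - exact': 0, '2 - unrelated': 1}
```

A lower position means the expected value was found closer to the top of the results, which is the kind of signal that can then be compared across indexes.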
EmbeddingIndex
Class
This class defines a vector solution to be used for testing.
Usage
from vekta_testa import runner
runner.EmbeddingIndex(
    index_name='openai',
    search_function=lambda scenario: faiss_index_openai.similarity_search_with_score(scenario, k=900),
    find_result=lambda values, target: find_id(values, target),
    post_processor=lambda scenario, scored_values: post_processor(scenario, scored_values)
)
Attributes
index_name: str
The name of the index.
search_function: Callable[[str], List[Tuple[Any, float]]]
The function used to perform a search.
post_processor: Optional[Callable[[str, List[Tuple[Any, float]]], List[Tuple[Any, float]]]] = None
An optional function to post-process the results from the search function.
find_result: Optional[Callable[[List[Tuple[Any, float]], str], int]] = None
An optional function to find a result based on a key within the results.
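The runner example passes `find_id` and `post_processor` helpers that are not defined in this README. Below is a minimal sketch of what such helpers could look like, matching the callable types listed above. It assumes each search hit is a `(document, score)` pair whose document carries an `id` in a `metadata` dictionary, and that lower scores mean closer matches (as with a distance). `Doc` is a hypothetical stand-in for the store's document type, and neither assumption is guaranteed to hold for your vector store.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Doc:
    """Hypothetical stand-in for a vector store's document object."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


Scored = List[Tuple[Doc, float]]


def find_id(values: Scored, target: str) -> int:
    """Return the 0-based position of the hit whose ID equals target, or -1."""
    for position, (doc, _score) in enumerate(values):
        if str(doc.metadata.get("id")) == target:
            return position
    return -1


def post_processor(scenario: str, scored_values: Scored) -> Scored:
    """Sort hits by ascending score and keep only the best hit per ID.

    Assumes the score is a distance (lower is better); the scenario is
    unused here but kept to match the expected callable signature.
    """
    seen = set()
    unique = []
    for doc, score in sorted(scored_values, key=lambda pair: pair[1]):
        doc_id = doc.metadata.get("id")
        if doc_id not in seen:
            seen.add(doc_id)
            unique.append((doc, score))
    return unique


# Illustrative hits: two chunks share ID 284.
hits = [
    (Doc("flat in Pune", {"id": 17}), 0.42),
    (Doc("2 BHK flat in Thane", {"id": 284}), 0.10),
    (Doc("2 BHK flat in Thane (duplicate chunk)", {"id": 284}), 0.35),
]
cleaned = post_processor("2 BHK flat in Thane", hits)
position = find_id(cleaned, "284")
print(len(cleaned), position)  # 2 0
```

Returning -1 for a missing target is a choice made for this sketch; check how the runner treats results that are not found before relying on it.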
Testcase
Class
This class defines the structure for a test case.
Usage
from vekta_testa import runner
runner.Testcase(
    case_id='1-exact',
    scenario="Thane has an appealing 2 BHK flat for sale with various amenities. Situated in the excellent Swastik Alps township."
)
Attributes
case_id: str
The ID of the test case.
scenario: str
The scenario to test. Usually a query to search for.