A package for testing vector embeddings.

Vecta Testa

A library for testing vector database search.

About

To architect systems that deliver advanced search capabilities, we need to be able to measure and understand search results. Only through testing can we determine which approach is appropriate for our data and problem space. This library evaluates how well searches locate an expected value from a vector store and graphs the result to help communicate our findings.

[Screenshot: graph of test results]

View our Jupyter Notebook for more details on this graph.

Vecta Testa Runner

The runner is the main component of the library. It defines how the tests are run and their properties.

from vekta_testa import runner

Here's an example of using the runner:

# faiss_index_openai, mpnet_index, find_id and post_processor are not defined
# in this snippet; they must be supplied by the caller.
results = runner.run_vecta_tests(
    # Indexes to compare: each one is run against every test case.
    [
        runner.EmbeddingIndex(
            index_name='openai',
            search_function=lambda scenario: faiss_index_openai.similarity_search_with_score(scenario, k=900),
            find_result=lambda values, target: find_id(values, target),
            post_processor=lambda scenario, scored_values: post_processor(scenario, scored_values)
        ),
        runner.EmbeddingIndex(
            index_name='deepset/all-mpnet-base-v2',
            search_function=lambda scenario: mpnet_index.similarity_search_with_score(scenario, k=900),
            find_result=lambda values, target: find_id(values, target),
            post_processor=lambda scenario, scored_values: post_processor(scenario, scored_values)
        ),
    ],
    # Test cases: each scenario is a query string passed to search_function.
    [
        runner.Testcase(
            case_id='1 - exact matching text',
            scenario="Thane has an appealing 2 BHK flat for sale with various amenities. Situated in the excellent Swastik Alps township."
        ),
        runner.Testcase(
            case_id='2 - partial match',
            scenario="Thane has an appealing 2 BHK flat for sale with various amenities. It's located in the exquisite township near the mountains."
        ),
        runner.Testcase(
            case_id='3 - using other words',
            scenario='Offers a charming two bedroom apartment available, equipped with numerous features. Located in the prime Swastik Heights community.'
        ),
    ],
    # Presumably the expected value the searches should locate
    # (it would reach find_result as `target`).
    '284'
)
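The example above calls find_id and post_processor without defining them. The following is a minimal sketch of what such helpers might look like, not the library's own code. It assumes each search result is a (document, score) pair, that the document carries its ID in metadata['id'], and that the score is a distance where lower is better. The Doc class, the sample data, and the -1 "not found" value are all hypothetical and exist only to make the sketch runnable.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Doc:
    """Hypothetical stand-in for a document object returned by a vector store."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


ScoredValues = List[Tuple[Doc, float]]


def post_processor(scenario: str, scored_values: ScoredValues) -> ScoredValues:
    """Sort results by score, ascending (assumes the score is a distance)."""
    return sorted(scored_values, key=lambda pair: pair[1])


def find_id(values: ScoredValues, target: str) -> int:
    """Return the 0-based position of the first result whose ID equals target.

    Returns -1 when the target is absent; that sentinel is an assumption.
    """
    for position, (doc, _score) in enumerate(values):
        if str(doc.metadata.get("id")) == target:
            return position
    return -1


# Made-up sample results, deliberately out of order.
sample = [
    (Doc("flat in Pune", {"id": 17}), 0.42),
    (Doc("2 BHK flat in Thane", {"id": 284}), 0.08),
]
ranked = post_processor("2 BHK flat in Thane", sample)
print(find_id(ranked, "284"))  # prints 0
```

If your store reports similarity rather than distance, the sort order in post_processor would need to be reversed.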

EmbeddingIndex Class

This class defines a vector solution to be used for testing.

Usage

from vekta_testa import runner
runner.EmbeddingIndex(
    index_name='openai',
    search_function=lambda scenario: faiss_index_openai.similarity_search_with_score(scenario, k=900),
    find_result=lambda values, target: find_id(values, target),
    post_processor=lambda scenario, scored_values: post_processor(scenario, scored_values)
)

Attributes

index_name: str

The name of the index.


search_function: Callable[[str], List[Tuple[Any, float]]]

The function used to perform a search.


post_processor: Optional[Callable[[str, List[Tuple[Any, float]]], List[Tuple[Any, float]]]] = None

An optional function to post-process the results from the search function.


find_result: Optional[Callable[[List[Tuple[Any, float]], str], int]] = None

An optional function to find a result based on a key within the results.
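To make the three callable signatures concrete, here is a small self-contained sketch that plugs toy implementations into a hand-written driver. It does not use or reproduce the library's runner: the evaluate function, the word-count "index", the corpus, and the order of calls (search, then post-process, then find) are assumptions inferred from the type hints above.

```python
import math
from collections import Counter
from typing import Any, Callable, List, Optional, Tuple

Scored = List[Tuple[Any, float]]

# Toy corpus keyed by document ID (made-up data, for illustration only).
CORPUS = {
    "101": "spacious three bedroom villa with garden",
    "284": "appealing 2 bhk flat for sale in thane",
    "350": "office space for rent near the station",
}


def _vector(text: str) -> Counter:
    """Bag-of-words counts; a crude substitute for a real embedding."""
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def toy_search(scenario: str) -> Scored:
    """Matches search_function: Callable[[str], List[Tuple[Any, float]]]."""
    query = _vector(scenario)
    return [(doc_id, _cosine(query, _vector(text))) for doc_id, text in CORPUS.items()]


def best_first(scenario: str, scored_values: Scored) -> Scored:
    """Matches post_processor: put the highest similarity first."""
    return sorted(scored_values, key=lambda pair: pair[1], reverse=True)


def position_of(values: Scored, target: str) -> int:
    """Matches find_result: 0-based position of target, or -1 (assumed convention)."""
    for position, (value, _score) in enumerate(values):
        if value == target:
            return position
    return -1


def evaluate(
    scenario: str,
    target: str,
    search_function: Callable[[str], Scored],
    post_processor: Optional[Callable[[str, Scored], Scored]] = None,
    find_result: Optional[Callable[[Scored, str], int]] = None,
) -> Optional[int]:
    """Hypothetical driver: search, optionally post-process, then locate the target."""
    values = search_function(scenario)
    if post_processor is not None:
        values = post_processor(scenario, values)
    return find_result(values, target) if find_result is not None else None


print(evaluate("2 BHK flat for sale in Thane", "284", toy_search, best_first, position_of))  # prints 0
```

Because post_processor and find_result are optional in the attribute list, the driver treats both as skippable; how the real runner behaves when they are omitted is not stated in this document.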


Testcase Class

This class defines the structure for a test case.

Usage

from vekta_testa import runner

runner.Testcase(
    case_id='1-exact',
    scenario="Thane has an appealing 2 BHK flat for sale with various amenities. Situated in the excellent Swastik Alps township."
)

Attributes

case_id: str

The ID of the test case.


scenario: str

The scenario to test. Usually a query to search for.

