Framework to procedurally query vector stores

Vecworks

Vecworks is an open-source framework built on top of Pypeworks to procedurally query vector stores in Python. It offers the following features:

  • Standardized access to various vectorization platforms, such as OpenAI, SBERT and fastText
  • Mixing and matching of vectorization procedures, reducible to a single output using ensemblers
  • Remote serving of custom vectorization procedures using the built-in server
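
To illustrate what reducing several procedures to a single output can mean, the sketch below merges ranked result lists with reciprocal rank fusion, a common generic technique. It is not taken from Vecworks: the function name `reciprocal_rank_fusion` and the constant `k` are assumptions made for this example, and Vecworks' own ensemblers may work differently.

```python
from collections import defaultdict
from typing import Hashable, Iterable, List, Sequence


def reciprocal_rank_fusion(
    rankings: Iterable[Sequence[Hashable]], k: int = 60
) -> List[Hashable]:
    """Merge several ranked lists into one (hypothetical helper, not a Vecworks API).

    Each item scores 1 / (k + rank) per list it appears in; scores are summed
    and items are returned from highest to lowest total score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Break ties on the item's string form to keep the output deterministic.
    return sorted(scores, key=lambda item: (-scores[item], str(item)))


# Two procedures that disagree on the order of the same three documents.
merged = reciprocal_rank_fusion([["a", "b", "c"], ["b", "c", "a"]])
print(merged)
```

Items ranked highly by several procedures rise to the top, while an item favoured by only one procedure is pushed down.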

Install

Vecworks is available through the PyPI repository and can be installed using pip:

pip install vecworks
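
To check that the installation succeeded, the installed version can be looked up with Python's standard library. This is a minimal sketch; the helper name `installed_version` is an assumption made for this example and is not part of Vecworks.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("vecworks")
print(f"vecworks {found}" if found else "vecworks is not installed")
```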

Quickstart

Vecworks' key concept is that of the Retriever. A retriever is a specialised Pypeworks node that vectorizes its inputs so that they can be cross-referenced with data in a vector store. As nodes, retrievers may be embedded in Pypeworks pipeworks, enabling applications such as semantic text matching, document classification, and RAG (when combined with Langworks).

Assuming a vector store has been set up on a PostgreSQL database, a retriever may be instantiated as follows:

import vecworks

from vecworks.retrievers.pgvector import (
    pgvectorRetriever
)

from vecworks.vectorizers.sbert import (
    sbertVectorizer
)

match = pgvectorRetriever(

    url   = "postgresql://127.0.0.1:5432/rag-mini-wikipedia",
    # Populated using https://huggingface.co/datasets/rag-datasets/rag-mini-wikipedia

    authenticator = vecworks.auth.UsernameCredentials("username", "password"),

    table = '"text-corpus"',

    index = [

        vecworks.Index(

            name         = "passage-e5-ml-large-q",
            # Column derived from 'passage', populated with vectorized contents of 'passage'.

            bind         = "input",

            distance     = vecworks.DISTANCES.cosine,
            max_distance = 0.2,
            top_k        = 5,

            vectorizer   = sbertVectorizer.create_from_string(

                "intfloat/multilingual-e5-large",

                prompt_format = "query: ",
                normalize     = True

            ),

            density      = vecworks.DENSITY.dense

        ),

    ],

    return_columns = ["passage", "id"],

    top_k       = 3

) 
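
The parameters distance, max_distance and top_k in the example above describe a nearest-neighbour search. The sketch below shows that idea in plain Python, without Vecworks or a database: cosine distance is computed as one minus cosine similarity, candidates further away than max_distance are dropped, and the closest top_k remain. The function names and the toy vectors are assumptions made for this example. Whether Vecworks treats max_distance as inclusive, and how the index-level and retriever-level top_k interact, is not stated here.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance: 1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(
    query: Sequence[float],
    rows: Sequence[Tuple[str, Sequence[float]]],
    max_distance: float,
    top_k: int,
) -> List[Tuple[float, str]]:
    """Return up to `top_k` (distance, payload) pairs within `max_distance`.

    Hypothetical helper for illustration; the cut-off is assumed to be inclusive.
    """
    scored = [(cosine_distance(query, vector), payload) for payload, vector in rows]
    kept = [pair for pair in scored if pair[0] <= max_distance]
    kept.sort(key=lambda pair: pair[0])
    return kept[:top_k]


# Toy two-dimensional "embeddings" standing in for stored passage vectors.
rows = [
    ("a", [1.0, 0.0]),
    ("b", [0.9, 0.1]),
    ("c", [0.0, 1.0]),
    ("d", [-1.0, 0.0]),
]
hits = nearest([1.0, 0.0], rows, max_distance=0.2, top_k=5)
print([payload for _, payload in hits])
```

With max_distance set to 0.2, only vectors pointing in nearly the same direction as the query survive, so fewer than top_k results may be returned.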
