Pre-release of Vectrs, a decentralized and distributed vector database network

Development Status
- 4 - Beta
Operating System
- OS Independent
Programming Language
- Python :: 3

Vectrs - Decentralized & Distributed Vector Database

Overview

Vectrs is a decentralized, distributed vector database for storing and retrieving vector embeddings. It runs on commodity hardware and scales horizontally by adding nodes, which is intended to make it cheaper to operate than a traditional centralized database. Data management is decentralized through a distributed hash table (DHT), which provides scalability and fault tolerance.
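
The `KademliaNode` class used later in this README suggests the DHT is Kademlia-style. As a rough illustration of how such a DHT decides which nodes are responsible for a key, the sketch below hashes node names and keys into one ID space and picks the nodes with the smallest XOR distance to the key. This is a generic illustration, not Vectrs's routing code: the node names, the key format, and the use of SHA-1 are all assumptions.

```python
import hashlib


def node_id(name: str) -> int:
    """Map an arbitrary string to a 160-bit integer ID (SHA-1, as in classic Kademlia)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: the bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def closest_nodes(key: str, nodes: list[str], k: int = 2) -> list[str]:
    """Return the k node names whose IDs are closest to the key's ID."""
    key_id = node_id(key)
    return sorted(nodes, key=lambda n: xor_distance(node_id(n), key_id))[:k]


if __name__ == "__main__":
    # Hypothetical node addresses and key, chosen only for illustration.
    nodes = [f"127.0.0.1:{port}" for port in (8468, 8469, 8470, 8471)]
    print(closest_nodes("db1:vec1", nodes))
```

Because every participant computes the same distances, any node can work out where a key should live without asking a central coordinator.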

Features

Distributed Storage: Data is distributed across multiple nodes for scalability and fault tolerance.
Cost-Effective: Utilizes commodity hardware to reduce costs.
Horizontal Scalability: Easily add more nodes to handle increased data load.
Efficient Vector Operations: Optimized for storing and querying vector embeddings.
OpenAI Integration: Supports storing and retrieving vector embeddings generated by OpenAI models.

Installation

You can install Vectrs from PyPI using pip:


pip install vectrs

Usage

Initializing a Vectrs Node

To initialize a Vectrs node, create a VectorDBManager, attach it to a KademliaNode, and start the node:

import asyncio
from vectrs.network import KademliaNode
from vectrs.database import VectorDBManager

async def start_node():
    db_manager = VectorDBManager()
    node = KademliaNode(host='127.0.0.1', port=8468)
    node.set_local_db_manager(db_manager)
    await node.start()
    return node

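async def main():
    node = await start_node()
    # Keep the event loop alive so the node can keep serving requests;
    # asyncio.run() closes the loop as soon as its coroutine returns.
    await asyncio.Event().wait()

if __name__ == "__main__":
    asyncio.run(main())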
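
Because `start()` is awaited, the node lives on the event loop that started it. The sketch below uses only the standard library (no Vectrs calls) to show that every `asyncio.run()` call creates and then closes its own loop. It assumes, as is typical for asyncio servers, that a node cannot be reused once its loop is closed, which is why startup and later operations are best awaited from a single coroutine.

```python
import asyncio


async def get_loop():
    """Return the event loop this coroutine is running on."""
    return asyncio.get_running_loop()


async def same_loop_twice() -> bool:
    """Two awaits inside one coroutine share a single loop."""
    first = await get_loop()
    second = await get_loop()
    return first is second


if __name__ == "__main__":
    loop_a = asyncio.run(get_loop())
    loop_b = asyncio.run(get_loop())
    print(loop_a is loop_b)                # False: each asyncio.run() gets a fresh loop
    print(loop_a.is_closed())              # True: the loop is closed when run() returns
    print(asyncio.run(same_loop_twice()))  # True: one run(), one loop
```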

Adding and Querying Vectors

Adding Vectors

You can add vectors to the database by using the add_vector method:

import numpy as np

async def add_vectors(node, db_id):
    vectors = {
        "vec1": np.random.rand(1024).astype(np.float32),
        "vec2": np.random.rand(1024).astype(np.float32),
    }
    metadata = "Example metadata"
    for vector_id, vector in vectors.items():
        await node.add_vector(db_id, vector_id, vector, metadata)
        print(f"Added vector with ID: {vector_id} and metadata: {metadata}")

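async def main():
    # Start the node and add vectors on the same event loop; separate
    # asyncio.run() calls would each create and close their own loop.
    node = await start_node()
    db_id = node.local_db_manager.create_database(dim=1024)
    print(f"Created database with ID: {db_id}")
    await add_vectors(node, db_id)

if __name__ == "__main__":
    asyncio.run(main())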
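
The database is created with a fixed `dim`, and the README does not say what happens when a vector of a different size is added. Assuming a mismatch is an error worth catching early, a small check before calling `add_vector` might look like the sketch below. `prepare_vector` is a hypothetical helper, not part of Vectrs.

```python
import numpy as np


def prepare_vector(values, dim: int) -> np.ndarray:
    """Convert input to a 1-D float32 array and check it against the database dimension.

    Hypothetical helper: Vectrs may perform its own validation internally.
    """
    vector = np.asarray(values, dtype=np.float32)
    if vector.ndim != 1:
        raise ValueError(f"expected a 1-D vector, got shape {vector.shape}")
    if vector.shape[0] != dim:
        raise ValueError(f"expected {dim} dimensions, got {vector.shape[0]}")
    if not np.all(np.isfinite(vector)):
        raise ValueError("vector contains NaN or infinite values")
    return vector


if __name__ == "__main__":
    vec = prepare_vector(np.random.rand(1024), dim=1024)
    print(vec.dtype, vec.shape)
```

The returned array can then be passed to `add_vector` in place of the raw input.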

Querying Vectors

You can query vectors from the database using the query_vector method:

async def query_vector(node, db_id, vector_id):
    vector = await node.query_vector(db_id, vector_id)
    print(f"Retrieved vector: {vector}")

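async def main():
    node = await start_node()
    db_id = node.local_db_manager.create_database(dim=1024)
    print(f"Created database with ID: {db_id}")
    # Add vectors first (add_vectors() from the previous example) so that
    # "vec1" exists before it is queried.
    await add_vectors(node, db_id)
    await query_vector(node, db_id, "vec1")

if __name__ == "__main__":
    asyncio.run(main())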

Example with OpenAI Embeddings

You can store OpenAI-generated vector embeddings in Vectrs:

import asyncio
import numpy as np
from openai import OpenAI

# Requires openai>=1.0; the older openai.Embedding.create call was removed in 1.0.
client = OpenAI(api_key='your_openai_api_key')

async def store_openai_embedding(node, db_id, text):
    response = client.embeddings.create(input=[text], model="text-embedding-ada-002")
    vector = np.array(response.data[0].embedding, dtype=np.float32)
    await node.add_vector(db_id, "openai_vec", vector, "OpenAI generated embedding")
    print("Stored OpenAI embedding")

async def main():
    node = await start_node()
    # text-embedding-ada-002 returns 1536-dimensional embeddings, so the
    # database dimension must be 1536 rather than 1024.
    db_id = node.local_db_manager.create_database(dim=1536)
    print(f"Created database with ID: {db_id}")
    await store_openai_embedding(node, db_id, "Example text for embedding")

if __name__ == "__main__":
    asyncio.run(main())
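
The database dimension has to match the embedding model's output size, and `text-embedding-ada-002` produces 1536-dimensional vectors. One way to avoid hard-coding the number is a small lookup keyed by model name, sketched below. `EMBEDDING_DIMS` and `dim_for_model` are hypothetical names, and the sizes listed are the models' default output dimensions.

```python
# Default output sizes of some OpenAI embedding models.
EMBEDDING_DIMS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def dim_for_model(model: str) -> int:
    """Return the embedding dimension for a known model, or raise ValueError."""
    try:
        return EMBEDDING_DIMS[model]
    except KeyError:
        raise ValueError(f"unknown embedding model: {model!r}") from None


if __name__ == "__main__":
    print(dim_for_model("text-embedding-ada-002"))  # 1536
```

The result can be passed straight to `create_database(dim=...)`.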

Retrieving Vector and Log Hashes

After adding a vector, you can retrieve its hash and log hash as follows:

async def add_vector_and_get_hashes(node, db_id):
    vector = np.random.rand(1024).astype(np.float32)
    vector_id = "vec1"
    metadata = "Example metadata"

    # Add the vector
    await node.add_vector(db_id, vector_id, vector, metadata)

    # Retrieve vector hash and log hash
    vector_hash = node.local_db_manager.get_vector_hash(db_id, vector_id)
    log_hash = node.local_db_manager.get_log_hash(db_id, vector_id)

    print(f"Added vector with ID: {vector_id} and metadata: {metadata}")
    print(f"Vector hash: {vector_hash}")
    print(f"Log hash: {log_hash}")

async def main():
    # Start the node and run the example on the same event loop.
    node = await start_node()
    db_id = node.local_db_manager.create_database(dim=1024)
    print(f"Created database with ID: {db_id}")
    await add_vector_and_get_hashes(node, db_id)

if __name__ == "__main__":
    asyncio.run(main())
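
The README does not describe how the vector hash or log hash is computed. Purely to illustrate what a content hash of a vector can be used for (detecting whether a retrieved vector differs from the one stored), the sketch below hashes the vector's raw bytes with SHA-256. The byte layout, the algorithm, and the `content_hash` name are assumptions; values produced this way should not be expected to match `get_vector_hash`.

```python
import hashlib

import numpy as np


def content_hash(vector: np.ndarray) -> str:
    """SHA-256 over the vector's little-endian float32 bytes (illustrative scheme only)."""
    # Normalize dtype, byte order, and memory layout so equal values hash identically.
    data = np.ascontiguousarray(vector, dtype="<f4").tobytes()
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    original = np.random.rand(1024).astype(np.float32)
    retrieved = original.copy()
    print(content_hash(original) == content_hash(retrieved))  # True: same contents

    retrieved[0] += 1.0
    print(content_hash(original) == content_hash(retrieved))  # False: contents differ
```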

API Reference

KademliaNode

Methods

start(): Starts the node and listens for connections.
stop(): Stops the node.
bootstrap(bootstrap_host, bootstrap_port): Bootstraps the node to an existing network.
add_vector(db_id, vector_id, vector, metadata=None): Adds a vector to the database.
query_vector(db_id, vector_id): Queries a vector from the database.
set_local_db_manager(db_manager): Sets the local database manager.
get_value(key): Retrieves a value from the DHT by key.

VectorDBManager

Methods

create_database(dim): Creates a new vector database with the specified dimensions and returns the database ID.
get_database(db_id): Retrieves a database by its ID.
get_vector_hash(db_id, vector_id): Retrieves the hash of a vector by its ID.
get_log_hash(db_id, vector_id): Retrieves the log hash of a vector by its ID.

Contribution

Contributions are welcome! Please open an issue or submit a pull request on GitHub.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

Support

For support or inquiries, please contact sakib@paralex.tech

Release history

0.3.3: Nov 1, 2024
0.3.2: Nov 1, 2024
0.3.1: Oct 29, 2024
0.3.0: Oct 28, 2024
0.1.0: Jun 9, 2024 (this version)

Download files

Source Distribution

vectrs-0.1.0.tar.gz (15.4 kB)

Uploaded Jun 9, 2024 Source

Built Distribution

vectrs-0.1.0-py3-none-any.whl (15.6 kB)

Uploaded Jun 9, 2024 Python 3

File details

Details for the file vectrs-0.1.0.tar.gz.

File metadata

Download URL: vectrs-0.1.0.tar.gz
Upload date: Jun 9, 2024
Size: 15.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.0 CPython/3.8.5

File hashes

Hashes for vectrs-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`ab464dbb52e2bc832eb0361798fbcc422dc37984e7cd479d7e5a02763f311d29`
MD5	`a5e3c168fb0541cb85e957747aa6cece`
BLAKE2b-256	`b01bff220073d6075882e0512a919e741e8afe5712cfb199249f630bfa4c0e92`
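
A downloaded archive can be checked against the SHA256 digest listed above. The sketch below uses only the standard library and assumes the file was saved as `vectrs-0.1.0.tar.gz` in the current directory; `sha256_of` and `verify` are illustrative helper names.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for vectrs-0.1.0.tar.gz.
EXPECTED_SHA256 = "ab464dbb52e2bc832eb0361798fbcc422dc37984e7cd479d7e5a02763f311d29"


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest matches the expected hex string."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("vectrs-0.1.0.tar.gz")  # assumed download location
    if archive.exists():
        print("OK" if verify(archive) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```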

File details

Details for the file vectrs-0.1.0-py3-none-any.whl.

File metadata

Download URL: vectrs-0.1.0-py3-none-any.whl
Upload date: Jun 9, 2024
Size: 15.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.0 CPython/3.8.5

File hashes

Hashes for vectrs-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4d655dead1313f772bddfa0e7f683681c1ecc1ec1c5b8037f9b065284604ff1c`
MD5	`e1e50815caf2eb83cdc8286545faf8ee`
BLAKE2b-256	`076c64352e25f2fdc330bbed272352a39f5f86c14ab559648c1bd6a7d107ce64`
