
VectorizeDB is a database for vectorized data and metadata, allowing for fast similarity search and retrieval.


Overview

VectorizeDB is a Python package for efficient storage and retrieval of high-dimensional vectors, together with optional metadata for each vector. It is particularly useful in applications such as machine learning and information retrieval. The package uses hnswlib for fast approximate nearest neighbor search and LMDB for scalable, reliable storage.

Installation

VectorizeDB requires Python 3.10 or higher. Install it with pip:

pip install vectorizedb

Usage

Initialization

from vectorizedb import Database

# Initialize a new database
db = Database(path="path/to/db", dim=128, readonly=False, similarity="cosine")
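
The `similarity="cosine"` argument selects cosine similarity as the metric. As a rough illustration of what that metric measures, here is a minimal pure-Python sketch. The function name `cosine_distance` is hypothetical and not part of VectorizeDB. The "1 minus cosine similarity" convention follows hnswlib's cosine space; whether VectorizeDB reports exactly this value is an assumption.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length numeric sequences.

    Hypothetical helper for illustration only; not part of VectorizeDB.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)
```

Under this convention, vectors pointing in the same direction have a distance near 0, orthogonal vectors a distance of 1, and opposite vectors a distance of 2.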

Adding Data

import numpy as np

# Add a vector with an associated key
db.add(key="sample_key", vector=np.random.rand(128))

# Add a vector with metadata
db.add(key="another_key", vector=np.random.rand(128), metadata={"info": "sample metadata"})

# Another way to add data
db["yet_another_key"] = (np.random.rand(128), {"info": "sample metadata"})
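
The examples above always pass vectors whose length matches the `dim=128` given at initialization. How VectorizeDB reacts to a mismatched vector is not stated here, so it may be worth checking inputs before adding them. The sketch below is one way to do that. `check_vector` is a hypothetical helper, not part of the package, and it accepts any iterable of numbers, including a NumPy array.

```python
import math


def check_vector(vector, dim):
    """Validate a vector before insertion and return it as a list of floats.

    Hypothetical helper; `dim` should match the value passed to Database(...).
    """
    values = [float(x) for x in vector]
    if len(values) != dim:
        raise ValueError(f"expected {dim} components, got {len(values)}")
    if any(math.isnan(x) or math.isinf(x) for x in values):
        raise ValueError("vector contains NaN or infinite values")
    return values
```

A call such as `check_vector(my_vector, 128)` can then run before `db.add(...)`, so that malformed input fails early with a clear message.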

Retrieving Data

# Retrieve vector and metadata by key
vector, metadata = db["sample_key"]

# Check if a key exists in the database
exists = "sample_key" in db
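
The two operations above can be combined into a lookup that falls back to a default. This sketch uses only the `in` test and `[]` access shown in this section. `get_or_default` is a hypothetical helper, and what `db[key]` does for a missing key is not documented here, which is why the membership check comes first.

```python
def get_or_default(db, key, default=None):
    """Return db[key] if the key is present, otherwise `default`.

    Works with any object that supports `in` and `[]`; hypothetical helper.
    """
    if key in db:
        return db[key]
    return default


# A plain dict stands in for a Database so the sketch runs on its own.
stand_in = {"sample_key": ([0.1, 0.2], {"info": "sample metadata"})}
found = get_or_default(stand_in, "sample_key")
missing = get_or_default(stand_in, "no_such_key")
```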

Searching

# Search for nearest neighbors of a vector
results = db.search(vector=np.random.rand(128), k=5)
for key, vector, distance, metadata in results:
    print(key, distance, metadata)
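
Because hnswlib performs approximate search, results can occasionally differ from the true nearest neighbors. On a small dataset, an exact brute-force search makes a useful reference to compare against. The sketch below is a minimal pure-Python version under that assumption. `exact_search` and its helper are hypothetical, and it operates on a plain dict of key-to-vector pairs rather than on a `Database` object.

```python
import heapq
import math


def _normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def exact_search(items, query, k):
    """Return the k nearest (key, distance) pairs by cosine distance, nearest first.

    `items` maps keys to vectors. Every vector is scanned, so the cost grows
    linearly with the number of items. Hypothetical helper for illustration.
    """
    q = _normalize(query)
    scored = (
        (1.0 - sum(a * b for a, b in zip(_normalize(vec), q)), key)
        for key, vec in items.items()
    )
    return [(key, dist) for dist, key in heapq.nsmallest(k, scored)]
```

Comparing the keys from `exact_search` with those from `db.search` over the same data gives a rough measure of recall.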

Iterating Through Data

# Iterate through all keys, vectors and metadata in the database
for key, vector, metadata in db:
    print(key, metadata)
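
Iteration also gives a simple way to select entries by metadata. The sketch below filters any iterable of `(key, vector, metadata)` triples, the shape shown in the loop above. `keys_where` is a hypothetical helper. Treating metadata as possibly `None` for entries added without it is an assumption, not documented behavior.

```python
def keys_where(rows, predicate):
    """Return keys whose metadata satisfies `predicate`.

    `rows` is any iterable of (key, vector, metadata) triples. Entries with
    no metadata (None) are skipped. Hypothetical helper for illustration.
    """
    return [
        key
        for key, _vector, metadata in rows
        if metadata is not None and predicate(metadata)
    ]


# A list of triples stands in for a Database so the sketch runs on its own.
rows = [
    ("a", [0.0], {"lang": "en"}),
    ("b", [1.0], None),
    ("c", [2.0], {"lang": "de"}),
]
english_keys = keys_where(rows, lambda m: m.get("lang") == "en")
```

This is a full scan, so it suits small databases or occasional maintenance tasks rather than per-query filtering.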

Updating Data

# Update a vector in the database
db.update_vector("sample_key", np.random.rand(128))

# Update metadata
db.update_metadata("sample_key", {"info": "updated metadata"})

Deleting Data

# Delete a vector from the database by key
del db["sample_key"]

Database Length

# Get the number of entries in the database
length = len(db)

License

VectorizeDB is released under the Apache License. For more details, see the LICENSE file included in the package.
