Encrypted Vector Database for Secure and Fast ANN Searches

VectorX - Encrypted Vector Database

VectorX is an encrypted vector database designed for security and speed. Vectors are encrypted on the client with private keys, which keeps data confidential while still allowing fast Approximate Nearest Neighbor (ANN) searches over the encrypted dataset. A proprietary algorithm performs these searches, targeting applications that need vector search in an encrypted environment.

Key Features

  • Client-side Encryption: Vectors are encrypted using private keys before being sent to the server
  • Fast ANN Searches: Efficient similarity searches on encrypted vector data
  • Multiple Distance Metrics: Support for cosine, L2, and inner product distance metrics
  • Metadata Support: Attach and search with metadata and filters
  • High Performance: Optimized for speed and efficiency with encrypted data
  • Hybrid Search: Combine dense and sparse vector search for improved retrieval quality
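
For readers unfamiliar with the three distance metrics listed above, here is a minimal sketch of their textbook definitions on plain, unencrypted lists. The function names are illustrative and are not part of vecx; how VectorX evaluates these metrics over encrypted data is not described in this document.

```python
import math

def inner_product(a, b):
    """Sum of element-wise products ("ip")."""
    return sum(x * y for x, y in zip(a, b))

def l2_distance(a, b):
    """Euclidean distance between two vectors ("l2")."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Inner product of the two vectors after length normalization ("cosine")."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

a = [1.0, 0.0]
b = [0.0, 1.0]
c = [2.0, 0.0]

print(cosine_similarity(a, c))  # a and c point in the same direction
print(cosine_similarity(a, b))  # a and b are orthogonal
print(l2_distance(a, c))
print(inner_product(a, c))
```

Cosine similarity ignores vector length, while L2 and inner product do not, so the choice of space_type should match how the embedding model was trained.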

Installation

pip install vecx

Quick Start

from vecx.vectorx import VectorX

# Initialize client with your API token
vx = VectorX(token="your-token-here")

# Generate a secure encryption key
encryption_key = vx.generate_key()


# Create a new index
vx.create_index(
    name="my_index",
    dimension=768,  # Your vector dimension
    key=encryption_key,  # Encryption key
    space_type="cosine"  # Distance metric (cosine, l2, ip)
)

# Get index reference
index = vx.get_index(name="my_index", key=encryption_key)

# Insert vectors
index.upsert([
    {
        "id": "doc1",
        "vector": [0.1, 0.2, 0.3, ...],  # Your vector data
        "meta": {"text": "Example document"},
        "filter": {"category": "reference"}  # Optional filter
    }
])

# Query similar vectors
results = index.query(
    vector=[0.2, 0.3, 0.4, ...],  # Query vector
    top_k=10,
    filter={"category": {"eq":"reference"}}  # Optional filter
)

# Process results
for item in results:
    print(f"ID: {item['id']}, Similarity: {item['similarity']}")
    print(f"Metadata: {item['meta']}")
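
Note that the upsert call stores flat filter values ({"category": "reference"}), while the query wraps the value in an operator ({"category": {"eq": "reference"}}). The sketch below shows how the two shapes presumably relate. It is an assumption-based illustration: matches_filter is a hypothetical helper, not part of vecx, and it handles only the eq operator because that is the only one shown in this document.

```python
def matches_filter(stored, query_filter):
    """Return True if the stored filter values satisfy every "eq" condition."""
    for field, condition in query_filter.items():
        for op, expected in condition.items():
            if op != "eq":
                raise ValueError(f"unsupported operator in this sketch: {op}")
            if stored.get(field) != expected:
                return False
    return True

records = [
    {"id": "doc1", "filter": {"category": "reference"}},
    {"id": "doc2", "filter": {"category": "blog"}},
    {"id": "doc3"},  # no filter values stored
]
query_filter = {"category": {"eq": "reference"}}

matching = [
    record["id"]
    for record in records
    if matches_filter(record.get("filter", {}), query_filter)
]
print(matching)  # ['doc1']
```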

Basic Usage

Initializing the Client

from vecx.vectorx import VectorX

# Initialize the client with your API token
vx = VectorX(token="your-token-here")

Managing Indexes

# List all indexes
indexes = vx.list_indexes()

# Create an index with custom parameters
vx.create_index(
    name="my_custom_index",
    dimension=768,
    key=encryption_key,
    space_type="cosine",
    M=16,             # Graph connectivity parameter (default = 16)
    ef_con=128,       # Construction-time parameter (default = 128)
    use_fp16=True     # Use half-precision for storage optimization (default = True)
)

# Delete an index
vx.delete_index("my_custom_index")
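
The use_fp16 option trades a little precision for half the storage. As a rough illustration, the sketch below packs a vector as IEEE 754 half-precision values using Python's standard struct module and measures the rounding error. Whether VectorX uses exactly this encoding internally is an assumption; the helper names are illustrative only.

```python
import struct

def to_fp16_bytes(vector):
    """Pack floats as IEEE 754 half-precision values (2 bytes each)."""
    return struct.pack(f"<{len(vector)}e", *vector)

def from_fp16_bytes(data):
    """Unpack half-precision bytes back into Python floats."""
    return list(struct.unpack(f"<{len(data) // 2}e", data))

vector = [0.1, 0.2, 0.3]
fp32_size = len(struct.pack(f"<{len(vector)}f", *vector))
packed = to_fp16_bytes(vector)
restored = from_fp16_bytes(packed)

print(fp32_size, len(packed))  # 12 6
print(restored)                # values close to, but not exactly, the originals

max_error = max(abs(x - y) for x, y in zip(vector, restored))
print(max_error)
```

For typical normalized embeddings this rounding error is small compared with the distances between vectors, which is presumably why half precision is the default.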

Working with Vectors

# Get index reference
index = vx.get_index(name="my_custom_index", key=encryption_key)

# Insert multiple vectors in a batch
index.upsert([
    {
        "id": "vec1",
        "vector": [...],  # Your vector
        "meta": {"title": "First document", "tags": ["important"]}
    },
    {
        "id": "vec2",
        "vector": [...],  # Another vector
        "meta": {"title": "Second document", "tags": ["important"]},
        "filter": {"visibility": "public"}  # Optional filter values
    }
])

# Query with custom parameters
results = index.query(
    vector=[...],      # Query vector
    top_k=5,           # Number of results to return
    filter={"visibility": {"eq": "public"}},  # Filter for matching
    ef=128,            # Runtime parameter for search quality
    include_vectors=True  # Include vector data in results
)

# Delete vectors
index.delete_vector("vec1")
index.delete_with_filter({"visibility":{"eq":"public"}})

# Get a specific vector
vector = index.get_vector("vec1")
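
When inserting a large dataset, it is usually more practical to send records in several upsert calls rather than one very large list. The following is a minimal sketch of that pattern. The batched_records helper is hypothetical (not part of vecx), and the batch size is arbitrary: this document does not state a per-call limit.

```python
def batched_records(records, batch_size):
    """Yield consecutive slices of records with at most batch_size items each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]

# Small stand-in records in the same shape as the upsert examples above.
records = [
    {"id": f"vec{i}", "vector": [float(i), 0.0, 1.0], "meta": {"position": i}}
    for i in range(5)
]

for batch in batched_records(records, batch_size=2):
    # With a real index, this is where index.upsert(batch) would be called.
    print([record["id"] for record in batch])
```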

Hybrid Search

VectorX now supports hybrid search, combining the strengths of both dense and sparse vector search for enhanced retrieval quality.

Creating a Hybrid Index

# Create a hybrid index
vx.create_hybrid_index(
    name="my_hybrid_index",
    dimension=768,      # Dimension for dense vectors
    vocab_size=30522,   # Vocabulary size for sparse vectors (default = 30522)
    key=encryption_key,
    space_type="cosine" # Distance metric for dense vectors
)

# Get reference to the hybrid index
hybrid_index = vx.get_hybrid_index(name="my_hybrid_index", key=encryption_key)
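
The vocab_size parameter bounds the token indices a sparse vector may use. The hybrid examples in this document pass sparse components as lists of {"index": ..., "value": ...} entries. The sketch below builds that shape from a plain {index: weight} mapping. The to_sparse_entries helper is hypothetical, and the checks it performs (zero-based indices below vocab_size, zero weights dropped) are assumptions rather than documented vecx behavior.

```python
def to_sparse_entries(weights, vocab_size=30522):
    """Convert {index: weight} into a sorted list of {"index", "value"} entries."""
    entries = []
    for index, value in sorted(weights.items()):
        # Assumption: valid indices run from 0 to vocab_size - 1.
        if not 0 <= index < vocab_size:
            raise ValueError(
                f"index {index} is outside the vocabulary (size {vocab_size})"
            )
        if value != 0:  # zero weights carry no information in a sparse vector
            entries.append({"index": index, "value": float(value)})
    return entries

print(to_sparse_entries({10: 1.0, 5: 0.5, 15: 1.5, 7: 0.0}))
```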

Working with Hybrid Vectors

# Insert vectors with both dense and sparse components
hybrid_index.upsert([
    {
        "id": "doc1",
        "vector": [0.1, 0.2, ...],  # Dense vector component
        "sparse_vector": [          # Sparse vector component
            {"index": 5, "value": 0.5},
            {"index": 10, "value": 1.0},
            {"index": 15, "value": 1.5}
        ],
        "meta": {"title": "Hybrid document", "tags": ["important"]}
    }
])

# Perform hybrid search
results = hybrid_index.hybrid_search(
    dense_vector=[0.2, 0.3, ...],   # Dense query vector
    sparse_vector=[                 # Sparse query vector
        {"index": 5, "value": 0.4},
        {"index": 20, "value": 0.8}
    ],
    dense_top_k=10,      # Number of results from dense search
    sparse_top_k=10,     # Number of results from sparse search
    final_top_k=5,       # Number of final results after fusion
    k_rrf=1.0,           # RRF constant for ranking fusion
    include_vectors=True, # Include vectors in results
    filter={"tags": {"eq": "important"}}  # Optional filter
)

# Process hybrid search results
for item in results:
    print(f"ID: {item['id']}, RRF Score: {item['rrf_score']}")
    print(f"Dense Rank: {item['dense_rank']}, Sparse Rank: {item['sparse_rank']}")
    print(f"Metadata: {item['meta']}")

# Delete a vector from both indices
hybrid_index.delete_vector("doc1")

# Delete hybrid index when no longer needed
vx.delete_hybrid_index("my_hybrid_index")
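
The k_rrf parameter and the rrf_score, dense_rank, and sparse_rank result fields refer to Reciprocal Rank Fusion (RRF), which merges two ranked lists by summing 1 / (k + rank) for each list in which a document appears. The sketch below implements the standard formulation with 1-based ranks on plain ID lists. The rrf_fuse function is illustrative only; whether VectorX applies exactly this formula and tie-breaking is an assumption.

```python
def rrf_fuse(dense_ids, sparse_ids, k_rrf=1.0, final_top_k=5):
    """Fuse two ranked ID lists with Reciprocal Rank Fusion."""
    dense_rank = {doc_id: rank for rank, doc_id in enumerate(dense_ids, start=1)}
    sparse_rank = {doc_id: rank for rank, doc_id in enumerate(sparse_ids, start=1)}

    fused = []
    for doc_id in dense_rank.keys() | sparse_rank.keys():
        score = 0.0
        for ranks in (dense_rank, sparse_rank):
            if doc_id in ranks:
                score += 1.0 / (k_rrf + ranks[doc_id])
        fused.append({
            "id": doc_id,
            "rrf_score": score,
            "dense_rank": dense_rank.get(doc_id),    # None if absent from dense results
            "sparse_rank": sparse_rank.get(doc_id),  # None if absent from sparse results
        })

    # Highest score first; ties broken by ID so the output is deterministic.
    fused.sort(key=lambda item: (-item["rrf_score"], item["id"]))
    return fused[:final_top_k]

results = rrf_fuse(["a", "b", "c"], ["b", "d"], k_rrf=1.0, final_top_k=3)
for item in results:
    print(item["id"], round(item["rrf_score"], 4), item["dense_rank"], item["sparse_rank"])
```

In this example "b" ranks first because it appears in both lists. A larger k_rrf flattens the difference between top and lower ranks, so documents found by both searches gain relatively more weight.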

API Reference

VectorX Class

  • __init__(token=None): Initialize with optional API token
  • set_token(token): Set API token
  • set_base_url(base_url): Set custom API endpoint
  • generate_key(): Generate a secure encryption key
  • create_index(name, dimension, key, space_type, ...): Create a new index
  • create_hybrid_index(name, dimension, vocab_size, key, ...): Create a new hybrid index
  • list_indexes(): List all indexes
  • delete_index(name): Delete an index
  • delete_hybrid_index(name): Delete a hybrid index
  • get_index(name, key): Get reference to an index
  • get_hybrid_index(name, key): Get reference to a hybrid index

Index Class

  • upsert(input_array): Insert or update vectors
  • query(vector, top_k, filter, ef, include_vectors): Search for similar vectors
  • delete_vector(id): Delete a vector by ID
  • delete_with_filter(filter): Delete vectors matching a filter
  • get_vector(id): Get a specific vector
  • describe(): Get index statistics and info

HybridIndex Class

  • upsert(input_array): Insert or update hybrid vectors
  • hybrid_search(dense_vector, sparse_vector, ...): Perform hybrid search
  • delete_vector(id): Delete a vector by ID
  • delete_with_filter(filter): Delete vectors matching a filter
  • describe(): Get hybrid index statistics and info

Security Considerations

  • Key Management: Store your encryption key securely. Loss of the key will result in permanent data loss.
  • Client-Side Encryption: All sensitive data is encrypted before transmission.
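
Because a lost key means lost data, the key should be persisted somewhere safe as soon as it is generated. The following is one possible sketch using only the standard library: it writes the key to a file that only the owner can read or write and reads it back. It assumes the key returned by generate_key() can be stored as text, which this document does not confirm, and the file name is hypothetical. A dedicated secrets manager is generally a better choice in production.

```python
import os
import stat
import tempfile

def save_key(path, key):
    """Write the key to a new file with owner-only permissions (0o600 on POSIX)."""
    # O_EXCL makes the call fail instead of overwriting an existing key file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(key)

def load_key(path):
    """Read the key back from disk."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()

key = "example-key-not-a-real-secret"  # stand-in for vx.generate_key()

with tempfile.TemporaryDirectory() as directory:
    key_path = os.path.join(directory, "vectorx.key")
    save_key(key_path, key)
    print(load_key(key_path) == key)
    print(oct(stat.S_IMODE(os.stat(key_path).st_mode)))
```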
