Encrypted Vector Database for Secure and Fast ANN Searches
VectorX - Encrypted Vector Database
VectorX is an encrypted vector database designed for security and speed. Vectors are encrypted on the client with a private key before they leave your machine, so the data stays confidential while Approximate Nearest Neighbor (ANN) searches still run quickly within the encrypted dataset. The search itself relies on a proprietary algorithm, which makes VectorX suited to applications that need vector search in an encrypted environment.
Key Features
- Client-side Encryption: Vectors are encrypted using private keys before being sent to the server
- Fast ANN Searches: Efficient similarity searches on encrypted vector data
- Multiple Distance Metrics: Support for cosine, L2, and inner product distance metrics
- Metadata Support: Attach and search with metadata and filters
- High Performance: Optimized for speed and efficiency with encrypted data
- Hybrid Search: Combine dense and sparse vector search for improved retrieval quality
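For reference, the three distance metrics named above can be written out in a few lines of plain Python. This is only a sketch of the standard definitions; how VectorX evaluates them over encrypted data is proprietary and not shown here. The function names are illustrative and are not part of the vecx package.

```python
import math

def inner_product(a, b):
    """Inner (dot) product: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def l2_distance(a, b):
    """Euclidean (L2) distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; ignores vector length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

a = [1.0, 0.0]
b = [2.0, 0.0]
print(inner_product(a, b))      # 2.0
print(l2_distance(a, b))        # 1.0
print(cosine_similarity(a, b))  # 1.0
```

The two vectors point in the same direction, so cosine similarity is 1.0 even though their L2 distance is non-zero. This is why the choice of space_type matters when vectors are not normalized.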
Installation
pip install vecx
Quick Start
from vecx.vectorx import VectorX
# Initialize client with your API token
vx = VectorX(token="your-token-here")
# Generate a secure encryption key
encryption_key = vx.generate_key()
# Create a new index
vx.create_index(
    name="my_index",
    dimension=768,        # Your vector dimension
    key=encryption_key,   # Encryption key
    space_type="cosine"   # Distance metric (cosine, l2, ip)
)
# Get index reference
index = vx.get_index(name="my_index", key=encryption_key)
# Insert vectors
index.upsert([
    {
        "id": "doc1",
        "vector": [0.1, 0.2, 0.3, ...],       # Your vector data
        "meta": {"text": "Example document"},
        "filter": {"category": "reference"}   # Optional filter
    }
])
# Query similar vectors
results = index.query(
    vector=[0.2, 0.3, 0.4, ...],                # Query vector
    top_k=10,
    filter={"category": {"eq": "reference"}}    # Optional filter
)
# Process results
for item in results:
    print(f"ID: {item['id']}, Similarity: {item['similarity']}")
    print(f"Metadata: {item['meta']}")
Basic Usage
Initializing the Client
from vecx.vectorx import VectorX
# Initialize the client with your API token
vx = VectorX(token="your-token-here")
Managing Indexes
# List all indexes
indexes = vx.list_indexes()
# Create an index with custom parameters
vx.create_index(
    name="my_custom_index",
    dimension=768,
    key=encryption_key,
    space_type="cosine",
    M=16,            # Graph connectivity parameter (default = 16)
    ef_con=128,      # Construction-time parameter (default = 128)
    use_fp16=True    # Use half-precision for storage optimization (default = True)
)
# Delete an index
vx.delete_index("my_custom_index")
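To see why use_fp16 matters, a rough estimate of raw vector storage is enough. The sketch below assumes 4 bytes per value at full precision and 2 bytes at half precision, and it counts only the vector values themselves. Graph links (governed by M), metadata, and any encryption overhead are ignored, so real index sizes will differ. The helper name raw_vector_bytes is hypothetical.

```python
def raw_vector_bytes(num_vectors, dimension, use_fp16=True):
    """Estimate bytes needed for the vector values alone."""
    # Assumption: half precision is 2 bytes per value, full precision is 4
    bytes_per_value = 2 if use_fp16 else 4
    return num_vectors * dimension * bytes_per_value

n, dim = 1_000_000, 768
print(raw_vector_bytes(n, dim, use_fp16=False) / 1e9)  # 3.072 (GB)
print(raw_vector_bytes(n, dim, use_fp16=True) / 1e9)   # 1.536 (GB)
```

Under these assumptions, half precision halves the raw footprint of one million 768-dimensional vectors, at the cost of reduced numeric precision per value.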
Working with Vectors
# Get index reference
index = vx.get_index(name="my_custom_index", key=encryption_key)
# Insert multiple vectors in a batch
index.upsert([
    {
        "id": "vec1",
        "vector": [...],   # Your vector
        "meta": {"title": "First document", "tags": ["important"]}
    },
    {
        "id": "vec2",
        "vector": [...],   # Another vector
        "meta": {"title": "Second document", "tags": ["important"]},
        "filter": {"visibility": "public"}   # Optional filter values
    }
])
# Query with custom parameters
results = index.query(
    vector=[...],                                # Query vector
    top_k=5,                                     # Number of results to return
    filter={"visibility": {"eq": "public"}},     # Filter for matching
    ef=128,                                      # Runtime parameter for search quality
    include_vectors=True                         # Include vector data in results
)
# Get a specific vector
vector = index.get_vector("vec1")
# Delete a vector by ID
index.delete_vector("vec1")
# Delete all vectors matching a filter
index.delete_with_filter({"visibility": {"eq": "public"}})
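When inserting many vectors, it can help to validate and batch the records on the client before calling upsert. The helpers below are a sketch: build_records and chunked are hypothetical names, the record layout follows the upsert examples above, and the batch size of 2 is arbitrary, since no server-side batch limit is stated here.

```python
def build_records(ids, vectors, metas, dimension):
    """Pair ids, vectors, and metadata into upsert-style dicts."""
    records = []
    for vec_id, vector, meta in zip(ids, vectors, metas):
        # Catch dimension mismatches before anything is sent to the server
        if len(vector) != dimension:
            raise ValueError(
                f"{vec_id}: expected {dimension} values, got {len(vector)}"
            )
        records.append({"id": vec_id, "vector": list(vector), "meta": meta})
    return records

def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

records = build_records(
    ids=["vec1", "vec2", "vec3"],
    vectors=[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    metas=[{"title": "a"}, {"title": "b"}, {"title": "c"}],
    dimension=2,
)
for batch in chunked(records, size=2):
    # With a live index this would be: index.upsert(batch)
    print([record["id"] for record in batch])
```

Checking the dimension locally gives a clear error message that names the offending ID, rather than relying on whatever the server returns for a malformed record.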
Hybrid Search
VectorX now supports hybrid search, combining the strengths of both dense and sparse vector search for enhanced retrieval quality.
Creating a Hybrid Index
# Create a hybrid index
vx.create_hybrid_index(
    name="my_hybrid_index",
    dimension=768,         # Dimension for dense vectors
    vocab_size=30522,      # Vocabulary size for sparse vectors (default = 30522)
    key=encryption_key,
    space_type="cosine"    # Distance metric for dense vectors
)
# Get reference to the hybrid index
hybrid_index = vx.get_hybrid_index(name="my_hybrid_index", key=encryption_key)
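Sparse vectors for a hybrid index are passed as a list of index/value pairs, as in the upsert example in the next section. A minimal sketch of building that list from a mapping of vocabulary positions to weights follows. The helper to_sparse_vector is hypothetical, and the bounds check reflects an assumption that every index must fall below vocab_size.

```python
def to_sparse_vector(weights, vocab_size=30522):
    """Convert {index: weight} into [{"index": i, "value": w}, ...]."""
    sparse = []
    for index in sorted(weights):
        value = weights[index]
        # Assumption: valid indices lie in the range [0, vocab_size)
        if not 0 <= index < vocab_size:
            raise ValueError(f"index {index} is outside [0, {vocab_size})")
        if value != 0:  # zero entries carry no information in a sparse vector
            sparse.append({"index": index, "value": float(value)})
    return sparse

print(to_sparse_vector({10: 1.0, 5: 0.5, 15: 0.0}))
# [{'index': 5, 'value': 0.5}, {'index': 10, 'value': 1.0}]
```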
Working with Hybrid Vectors
# Insert vectors with both dense and sparse components
hybrid_index.upsert([
    {
        "id": "doc1",
        "vector": [0.1, 0.2, ...],   # Dense vector component
        "sparse_vector": [           # Sparse vector component
            {"index": 5, "value": 0.5},
            {"index": 10, "value": 1.0},
            {"index": 15, "value": 1.5}
        ],
        "meta": {"title": "Hybrid document", "tags": ["important"]}
    }
])
# Perform hybrid search
results = hybrid_index.hybrid_search(
    dense_vector=[0.2, 0.3, ...],   # Dense query vector
    sparse_vector=[                 # Sparse query vector
        {"index": 5, "value": 0.4},
        {"index": 20, "value": 0.8}
    ],
    dense_top_k=10,                 # Number of results from dense search
    sparse_top_k=10,                # Number of results from sparse search
    final_top_k=5,                  # Number of final results after fusion
    k_rrf=1.0,                      # RRF constant for ranking fusion
    include_vectors=True,           # Include vectors in results
    filter={"tags": {"eq": "important"}}   # Optional filter
)
# Process hybrid search results
for item in results:
    print(f"ID: {item['id']}, RRF Score: {item['rrf_score']}")
    print(f"Dense Rank: {item['dense_rank']}, Sparse Rank: {item['sparse_rank']}")
    print(f"Metadata: {item['meta']}")
# Delete a vector from both indices
hybrid_index.delete_vector("doc1")
# Delete hybrid index when no longer needed
vx.delete_hybrid_index("my_hybrid_index")
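The k_rrf, dense_rank, and sparse_rank names suggest Reciprocal Rank Fusion (RRF). The sketch below shows the textbook form of RRF, in which each document scores the sum of 1 / (k + rank) over the ranked lists it appears in. Whether VectorX uses exactly this formula is an assumption, and rrf_fuse is a hypothetical helper rather than part of vecx.

```python
def rrf_fuse(dense_ids, sparse_ids, k=1.0, final_top_k=5):
    """Fuse two ranked id lists with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in (dense_ids, sparse_ids):
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:final_top_k]

dense = ["doc1", "doc2", "doc3"]    # best dense match first
sparse = ["doc3", "doc1", "doc4"]   # best sparse match first
for doc_id, score in rrf_fuse(dense, sparse, k=1.0, final_top_k=3):
    print(doc_id, round(score, 4))
# doc1 0.8333
# doc3 0.75
# doc2 0.3333
```

In this form, a document that ranks well in both lists (doc1, doc3) outscores one that appears in only a single list (doc2), and a larger k flattens the difference between adjacent ranks.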
API Reference
VectorX Class
- __init__(token=None): Initialize with optional API token
- set_token(token): Set API token
- set_base_url(base_url): Set custom API endpoint
- generate_key(): Generate a secure encryption key
- create_index(name, dimension, key, space_type, ...): Create a new index
- create_hybrid_index(name, dimension, vocab_size, key, ...): Create a new hybrid index
- list_indexes(): List all indexes
- delete_index(name): Delete an index
- delete_hybrid_index(name): Delete a hybrid index
- get_index(name, key): Get reference to an index
- get_hybrid_index(name, key): Get reference to a hybrid index
Index Class
- upsert(input_array): Insert or update vectors
- query(vector, top_k, filter, ef, include_vectors): Search for similar vectors
- delete_vector(id): Delete a vector by ID
- delete_with_filter(filter): Delete vectors matching a filter
- get_vector(id): Get a specific vector
- describe(): Get index statistics and info
HybridIndex Class
- upsert(input_array): Insert or update hybrid vectors
- hybrid_search(dense_vector, sparse_vector, ...): Perform hybrid search
- delete_vector(id): Delete a vector by ID
- delete_with_filter(filter): Delete vectors matching a filter
- describe(): Get hybrid index statistics and info
Security Considerations
- Key Management: Store your encryption key securely. Loss of the key will result in permanent data loss.
- Client-Side Encryption: All sensitive data is encrypted before transmission.
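One way to act on the key-management advice is to keep the key out of source code and in a file readable only by its owner. The sketch below uses only the Python standard library. save_key and load_key are hypothetical helpers, the key is assumed to be a string (the return type of generate_key() is not documented here), and a dedicated secrets manager may be a better fit in production.

```python
import os
import tempfile

def save_key(key, path):
    """Write the key to `path`, created with owner-only permissions (POSIX)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(key)

def load_key(path):
    """Read the key back from `path`."""
    with open(path) as handle:
        return handle.read().strip()

# Demo with a placeholder key in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    key_path = os.path.join(tmp, "vectorx.key")
    save_key("example-key-not-real", key_path)
    print(load_key(key_path) == "example-key-not-real")  # True
```

Because a lost key means permanently unreadable data, keeping a second copy of the key file in separate, equally protected storage is worth considering.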