A high-performance vector database for AI applications
VelesDB Python
Python bindings for VelesDB - a high-performance vector database for AI applications.
Features
- Vector Similarity Search: HNSW index with SIMD-optimized distance calculations
- Multiple Distance Metrics: Cosine, Euclidean, Dot Product, Hamming, Jaccard
- Persistent Storage: Memory-mapped files for efficient disk I/O
- Metadata Support: Store and retrieve JSON payloads with vectors
- NumPy Integration: Native support for NumPy arrays
Installation
pip install velesdb
Quick Start
import velesdb
# Open or create a database
db = velesdb.Database("./my_vectors")
# Create a collection for 768-dimensional vectors (e.g., BERT embeddings)
collection = db.create_collection(
    name="documents",
    dimension=768,
    metric="cosine"  # Options: "cosine", "euclidean", "dot", "hamming", "jaccard"
)
# Insert vectors with metadata
collection.upsert([
    {
        "id": 1,
        "vector": [0.1, 0.2, ...],  # 768-dim vector
        "payload": {"title": "Introduction to AI", "category": "tech"}
    },
    {
        "id": 2,
        "vector": [0.3, 0.4, ...],
        "payload": {"title": "Machine Learning Basics", "category": "tech"}
    }
])
# Search for similar vectors
results = collection.search(
    vector=[0.15, 0.25, ...],  # Query vector
    top_k=5
)
for result in results:
    print(f"ID: {result['id']}, Score: {result['score']:.4f}")
    print(f"Payload: {result['payload']}")
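The `...` placeholders in the Quick Start stand for the remaining vector components; every vector has to carry exactly as many values as the collection's `dimension`. The sketch below builds full-length points in the same dict shape using random numbers in place of real embeddings. The helper names `random_vector` and `make_point` are illustrative and not part of VelesDB.

```python
import random

DIMENSION = 768  # must match the dimension passed to create_collection()


def random_vector(dimension, rng):
    """Return `dimension` floats in [0, 1); a stand-in for a real embedding."""
    return [rng.random() for _ in range(dimension)]


def make_point(point_id, vector, payload=None):
    """Build one point dict in the shape used by the Quick Start."""
    point = {"id": point_id, "vector": list(vector)}
    if payload is not None:
        point["payload"] = payload
    return point


rng = random.Random(42)  # fixed seed so the example is repeatable
points = [
    make_point(1, random_vector(DIMENSION, rng),
               {"title": "Introduction to AI", "category": "tech"}),
    make_point(2, random_vector(DIMENSION, rng),
               {"title": "Machine Learning Basics", "category": "tech"}),
]
query = random_vector(DIMENSION, rng)

print(len(points), len(points[0]["vector"]), len(query))  # 2 768 768
```

With VelesDB installed, `points` could then be passed to `collection.upsert(points)` and `query` to `collection.search(vector=query, top_k=5)`.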
API Reference
Database
# Create/open database
db = velesdb.Database("./path/to/data")
# List collections
names = db.list_collections()
# Create collection
collection = db.create_collection("name", dimension=768, metric="cosine")
# Get existing collection
collection = db.get_collection("name")
# Delete collection
db.delete_collection("name")
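The calls above can be combined into a small "get or create" helper. This is a sketch that assumes `list_collections()` returns the collection names (as the variable name `names` suggests); the helper `get_or_create_collection` and the `FakeDatabase` stand-in are hypothetical and exist only so the example runs without VelesDB installed.

```python
def get_or_create_collection(db, name, dimension, metric="cosine"):
    """Return the collection called `name`, creating it if it is not listed yet."""
    if name in db.list_collections():
        return db.get_collection(name)
    return db.create_collection(name, dimension=dimension, metric=metric)


class FakeDatabase:
    """In-memory stand-in exposing the four Database methods listed above."""

    def __init__(self):
        self._collections = {}

    def list_collections(self):
        return list(self._collections)

    def get_collection(self, name):
        return self._collections[name]

    def create_collection(self, name, dimension, metric="cosine"):
        self._collections[name] = {"name": name, "dimension": dimension, "metric": metric}
        return self._collections[name]

    def delete_collection(self, name):
        del self._collections[name]


db = FakeDatabase()
first = get_or_create_collection(db, "documents", dimension=768)
second = get_or_create_collection(db, "documents", dimension=768)
print(first is second)  # True: the second call reuses the existing collection
```

Passing a real `velesdb.Database` instead of `FakeDatabase` should work the same way if the assumption about `list_collections()` holds.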
Collection
# Get collection info
info = collection.info()
# {"name": "documents", "dimension": 768, "metric": "cosine", "point_count": 100}
# Insert/update vectors (with immediate flush)
collection.upsert([
    {"id": 1, "vector": [...], "payload": {"key": "value"}}
])
# Bulk insert (optimized for high-throughput - 3-7x faster)
# Uses parallel HNSW insertion + single flush at the end
collection.upsert_bulk([
    {"id": i, "vector": vectors[i].tolist()} for i in range(10000)
])
# Search
results = collection.search(vector=[...], top_k=10)
# Get specific points
points = collection.get([1, 2, 3])
# Delete points
collection.delete([1, 2, 3])
# Check if empty
is_empty = collection.is_empty()
# Flush to disk
collection.flush()
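Because each collection has a fixed `dimension`, it can help to check points on the Python side before inserting them. The following is a minimal sketch; `validate_points` is a hypothetical helper, and how VelesDB itself reacts to a vector of the wrong length is not covered here.

```python
def validate_points(points, dimension):
    """Raise ValueError if a point lacks an id or has a vector of the wrong length."""
    for index, point in enumerate(points):
        if "id" not in point:
            raise ValueError(f"point at position {index} has no 'id'")
        length = len(point["vector"])
        if length != dimension:
            raise ValueError(
                f"point {point['id']}: expected {dimension} values, got {length}"
            )
    return len(points)


# Same keys as the info() example above, with a small dimension for readability.
info = {"name": "documents", "dimension": 4, "metric": "cosine", "point_count": 0}

good = [{"id": 1, "vector": [0.1, 0.2, 0.3, 0.4], "payload": {"key": "value"}}]
bad = [{"id": 2, "vector": [0.1, 0.2, 0.3]}]

print(validate_points(good, info["dimension"]))  # 1
try:
    validate_points(bad, info["dimension"])
except ValueError as error:
    print(error)  # point 2: expected 4 values, got 3
```

In real use, `info` would come from `collection.info()` rather than a hand-written dict.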
Bulk Loading Performance
For large-scale data import, use upsert_bulk() instead of upsert():
| Method | 10k vectors (768D) | Notes |
|---|---|---|
| upsert() | ~47s | Flushes after each batch |
| upsert_bulk() | ~3s | Single flush + parallel HNSW |
# Recommended for bulk import
import numpy as np
vectors = np.random.rand(10000, 768).astype('float32')
points = [{"id": i, "vector": v.tolist()} for i, v in enumerate(vectors)]
collection.upsert_bulk(points) # 7x faster than upsert()
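For datasets too large to hold as one list of dicts, one option is to feed `upsert_bulk()` in fixed-size chunks. This is a sketch under assumptions: the helpers `chunked` and `load_in_chunks` are hypothetical, the chunk size is arbitrary, and `RecordingCollection` is a stand-in that only records call sizes so the example runs without VelesDB.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def load_in_chunks(collection, points, chunk_size=10_000):
    """Send `points` through upsert_bulk() one chunk at a time; return the chunk count."""
    chunks = 0
    for chunk in chunked(points, chunk_size):
        collection.upsert_bulk(chunk)
        chunks += 1
    return chunks


class RecordingCollection:
    """Stand-in that records how many points each upsert_bulk() call received."""

    def __init__(self):
        self.calls = []

    def upsert_bulk(self, points):
        self.calls.append(len(points))


collection = RecordingCollection()
points = [{"id": i, "vector": [0.0, 0.0]} for i in range(25)]
print(load_in_chunks(collection, points, chunk_size=10), collection.calls)
# 3 [10, 10, 5]
```

Since `upsert_bulk()` flushes once per call, larger chunks mean fewer flushes; the chunk size trades peak memory against flush count.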
Distance Metrics
| Metric | Description | Use Case |
|---|---|---|
| cosine | Cosine similarity (default) | Text embeddings, normalized vectors |
| euclidean | Euclidean (L2) distance | Image features, spatial data |
| dot | Dot product | When vectors are pre-normalized |
| hamming | Hamming distance | Binary vectors, fingerprints, hashes |
| jaccard | Jaccard similarity | Set similarity, tags, recommendations |
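For reference, the five metrics in the table can be written out in plain Python. These are the textbook definitions, not VelesDB's SIMD implementation; in particular, whether VelesDB reports a similarity or a distance for each metric, and how it encodes sets for Jaccard, are not specified here and are assumptions of this sketch.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector norms."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def jaccard_similarity(a, b):
    """Size of the intersection over size of the union; 1.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def normalize(a):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(a, a))
    return [x / norm for x in a]


print(euclidean_distance([0, 0], [3, 4]))                      # 5.0
print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))            # 2
print(jaccard_similarity({"a", "b", "c"}, {"b", "c", "d"}))    # 0.5

# For unit-length vectors the dot product equals the cosine similarity,
# which is why "dot" suits pre-normalized vectors.
a, b = [1.0, 2.0, 2.0], [2.0, 0.0, 1.0]
print(math.isclose(dot(normalize(a), normalize(b)), cosine_similarity(a, b)))  # True
```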
Performance
VelesDB is built in Rust with explicit SIMD optimizations:
| Operation | Time (768d) | Throughput |
|---|---|---|
| Cosine | ~76 ns | 13M ops/sec |
| Euclidean | ~47 ns | 21M ops/sec |
| Hamming | ~6 ns | 164M ops/sec |
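The two numeric columns are related by simple arithmetic: assuming the throughput figure is single-threaded, back-to-back execution, operations per second is the reciprocal of the per-operation time. A short check (the function name `ops_per_second` is illustrative):

```python
def ops_per_second(latency_ns):
    """Convert a per-operation latency in nanoseconds to operations per second."""
    return 1e9 / latency_ns


for name, latency_ns in [("Cosine", 76), ("Euclidean", 47), ("Hamming", 6)]:
    print(f"{name}: {ops_per_second(latency_ns) / 1e6:.1f}M ops/sec")
# Cosine: 13.2M ops/sec
# Euclidean: 21.3M ops/sec
# Hamming: 166.7M ops/sec
```

The Hamming row's 164M ops/sec corresponds to roughly 6.1 ns per operation, which is consistent with the rounded "~6 ns" in the table.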
Benchmark: VelesDB vs pgvector (HNSW)
Tested on clustered 768D embeddings, chosen to resemble realistic AI workloads:
| Dataset Size | VelesDB Recall | VelesDB P50 | pgvector P50 | Speedup |
|---|---|---|---|---|
| 1,000 | 100.0% | 0.5ms | 50ms | 100x |
| 10,000 | 99.0% | 2.5ms | 50ms | 20x |
| 100,000 | 97.8% | 4.3ms | 50ms | 12x |
- 12-100x faster than pgvector depending on dataset size
- 97-100% recall across all scales
- Sub-5ms latency even at 100k vectors
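The recall and speedup columns can be reproduced with a few lines of Python. This sketch assumes the usual definition of recall@k (the share of the exact nearest neighbors that the approximate search also returned) and a speedup equal to the ratio of P50 latencies; the benchmark's exact methodology is not described here, and the function names are illustrative.

```python
import math


def exact_top_k(query, vectors, k):
    """Brute-force ground truth: indices of the k vectors closest to `query` (L2)."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return order[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result also contains."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
truth = exact_top_k([0.0, 0.0], vectors, k=3)
print(truth)                          # [0, 1, 2]
print(recall_at_k([0, 2, 3], truth))  # two of the three true neighbors were found

# Speedups from the P50 columns of the table above.
for velesdb_ms in (0.5, 2.5, 4.3):
    print(round(speedup(50, velesdb_ms)))  # 100, 20, 12
```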
Requirements
- Python 3.9+
- No external dependencies (pure Rust engine)
- Optional: NumPy for array support
License
Elastic License 2.0 (ELv2)
See LICENSE for details.