A lightweight vector database with incremental inserts, automatic exact-to-ANN switching, and explicit storage management. Add vectors anytime without rebuilding the index
# 📦 Vinkra

Vector Incremental Nano Kit — Reconfigured Automatically

> "A vector DB that self-organizes. Auto-switch, auto-tune, auto-scale."

> [!WARNING]
> This project is currently in pre-alpha.
## Table of Contents

Table of Contents generated with DocToc

- 🤔 So What's vinkra Anyway? (And Why Should You Care?)
- 📦 Installation
- ✅ Proof It Works
- 🚀 Usage
- 🚨 Exceptions (API)
- 🗺️ Features & Roadmap
- 🔧 Core Dependencies
- 🤝 Contributing
- 📜 License
## 🤔 So What's vinkra Anyway? (And Why Should You Care?)
Most vector databases force a trade-off: you either over-engineer for small datasets or hit a performance cliff as you scale. You're left babysitting indices, manually tuning parameters, and praying your hardware can keep up.

Vinkra eliminates the guesswork. It automatically switches from Exact Search (for 100% precision) to ANN (for massive scale with IVF-PQ) based on dataset size and runtime latency. Whether you are running on a mobile device or a high-end server, Vinkra adapts its optimization strategy to your hardware and data distribution.
| Feature | Why it's awesome |
|---|---|
| ✅ Incremental Inserts | Add vectors anytime. Your index grows with your data, not against it. |
| 🚀 Hardware-Aware Auto-Switch | It figures out when to ditch exact search and switch to ANN based on latency prediction. |
| ⚙️ Self-Tuning Engine | Background reconfiguration keeps clusters fresh as your data evolves. |
| 🎯 Production-Ready Search | Filtered searches, soft deletes, compaction, dual metrics (euclidean + cosine). |
| 💾 Explicit Storage | Disk or memory — you control where your data lives. |
Unlike enterprise solutions (Milvus, Pinecone) that require complex Docker or cloud setups, Vinkra runs entirely locally — nothing beyond a `pip install`.
And that's just the start - there's plenty more to explore!
## 📦 Installation
First, ensure that you have the necessary system dependencies installed.

- **Linux only** (required for building `rii`):

  ```bash
  # Debian/Ubuntu
  sudo apt-get install python3-dev
  # RedHat/Fedora/CentOS
  sudo dnf install python3-devel -y
  # CentOS 7 and older
  sudo yum install python3-devel
  ```

- **Android/Termux:**

  ```bash
  pkg install -y tur-repo
  pkg install python-scipy
  ```
### The Quick & Easy Way

The simplest way to get started is with pip:

```bash
pip install vinkra
```

### The From-Source Way

Prefer building from source? Clone and install manually for full control:

```bash
git clone https://github.com/speedyk-005/vinkra.git
cd vinkra
pip install -e .
```

(But honestly, the pip way is usually way easier!)
## ✅ Proof It Works

Run the demo to see the auto-switch in action:

```bash
# Install and run anywhere
curl -O https://raw.githubusercontent.com/speedyk-005/vinkra/main/demo_poc.py
python demo_poc.py
```

The demo uses:

- `switch_latency_ms=120` (vs. the default of 300) — triggers the switch sooner
- `dim=128`
- Batches of 10,000 vectors
The switch happens when latency exceeds `switch_latency_ms`. A power-law model (`y = a * x^b`) continuously tunes itself from actual search latencies to predict future performance. New vectors are buffered during the switch with zero downtime.

Results vary by hardware and system load — faster machines switch later, and running other programs will affect timing.
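The latency model is easy to reproduce outside the library. Below is a minimal, self-contained sketch (not vinkra's internal code) of fitting `y = a * x^b` by least squares in log-log space, and of predicting the dataset size at which latency would cross the switch threshold:

```python
import math

def fit_power_law(sizes, latencies_ms):
    """Least-squares fit of y = a * x^b, done linearly in log-log space."""
    xs = [math.log(x) for x in sizes]
    ys = [math.log(y) for y in latencies_ms]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    b = num / den
    a = math.exp(mean_y - b * mean_x)
    return a, b

def predicted_switch_size(a, b, threshold_ms):
    """Dataset size at which the predicted latency crosses the threshold."""
    return (threshold_ms / a) ** (1.0 / b)

# Synthetic latencies that follow y = 0.001 * x^1.2 exactly
sizes = [10_000, 20_000, 30_000, 40_000]
latencies = [0.001 * x ** 1.2 for x in sizes]
a, b = fit_power_law(sizes, latencies)
print(round(b, 3))  # 1.2
```

With real measurements the fit is refreshed as new latencies arrive, so the prediction tracks the machine it is actually running on rather than a fixed cutoff.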
Example output:

| Vectors | Strategy | Avg Query (ms) | Insert Time (s) | Status |
|---|---|---|---|---|
| 10,000 | exact_search | 32.486 | 0.806 | Exact Search |
| 20,000 | exact_search | 79.690 | 0.729 | Exact Search |
| 30,000 | exact_search | 107.419 | 0.720 | Exact Search |
| 40,000 | exact_search | 188.063 | 0.771 | ⏳ Building ANN |
| 50,000 | approximate_search | 0.000 | 10.051 | ✅ ANN Active |
| 60,000 | approximate_search | 155.239 | 1.323 | ✅ ANN Active |

✅ ANN switch successfully triggered!
## 🚀 Usage

### Initialization (VinkraDB API)

```python
from vinkra import VinkraDB

# Create a database with 128-dimensional vectors
db = VinkraDB("./data", dim=128)

# Or use in-memory mode (no persistence)
db = VinkraDB(":memory:", dim=128)

# With custom settings
db = VinkraDB(
    dir_path="./data",
    dim=384,
    metric="euclidean",        # or "cosine" (default: euclidean)
    force_exact=False,         # True disables ANN (default: False)
    ann_config=None,           # AnnConfig for PQ/OPQ (default: auto-generated)
    switch_latency_ms=300,     # ms threshold for the ANN switch (default: 300)
    embedding_callback=None,   # fn to generate embeddings from content
    overwrite=False,           # overwrite existing index (default: False)
    verbose=False,             # enable verbose output (default: False)
)
```
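The `metric` choice matters: euclidean distance is sensitive to vector magnitude, while cosine distance depends only on direction. A quick stdlib-only illustration (independent of vinkra):

```python
import math

def euclidean(u, v):
    """Straight-line distance; grows with magnitude differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    """1 - cosine similarity; depends only on direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

u, v = [1.0, 0.0], [2.0, 0.0]  # same direction, different magnitude
print(euclidean(u, v))         # 1.0
print(cosine_distance(u, v))   # 0.0
```

So for normalized embeddings the two metrics rank neighbors the same way; for unnormalized ones, pick the metric your embedding model was trained for.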
### AnnConfig (API)

Want custom ANN settings?

```python
from vinkra import AnnConfig

config = AnnConfig(
    num_subspaces=16,   # number of sub-vectors (default: 32)
    quantizer="pq",     # "pq" or "opq" (default: "pq")
    codebook_size=128,  # centroids per subspace (default: 256)
)

db = VinkraDB("./data", dim=384, ann_config=config)

# Print all available options:
AnnConfig.help()
```
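To build intuition for these knobs: with product quantization, each vector is split into `num_subspaces` sub-vectors, and each sub-vector is replaced by the index of its nearest centroid out of `codebook_size`. A back-of-envelope sketch of the resulting compression (assuming one stored code per subspace, rounded up to whole bytes):

```python
import math

def pq_code_bytes(num_subspaces, codebook_size):
    """Bytes per compressed vector: one centroid index per subspace."""
    bits_per_code = math.ceil(math.log2(codebook_size))
    return num_subspaces * math.ceil(bits_per_code / 8)

dim = 384
raw_bytes = dim * 4  # float32 storage
compressed = pq_code_bytes(num_subspaces=16, codebook_size=128)
print(raw_bytes, compressed, raw_bytes // compressed)  # 1536 16 96
```

More subspaces or larger codebooks mean better recall but bigger codes and slower training, which is why the defaults are a middle ground.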
### Add (API)

Records need:

- `content` (required): text to store
- `embedding` (required if no callback): list of floats or numpy array, shape `(d,)` or `(1, d)`
- `id` (optional): valid UUIDv7
- `metadata` (optional): dict of key-value pairs

Provide embeddings directly, or use a callback to generate them on the fly.
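If you supply your own `id`s, they must be valid UUIDv7 values. Newer Python versions ship `uuid.uuid7()`; on older ones you can generate a compatible value yourself. A minimal sketch following the RFC 9562 layout (48-bit Unix-millisecond timestamp, 4-bit version, variant bits, random tail):

```python
import os
import time
import uuid

def uuid7():
    """Minimal UUIDv7 sketch: 48-bit Unix-ms timestamp, version 7, random tail."""
    ts_ms = int(time.time() * 1000) & ((1 << 48) - 1)
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                           # top 12 bits
    rand_b = rand & ((1 << 62) - 1)               # low 62 bits
    value = (ts_ms << 80) | (0x7 << 76) | (rand_a << 64) | (0b10 << 62) | rand_b
    return uuid.UUID(int=value)

u = uuid7()
print(u.version)  # 7
```

Because the timestamp occupies the high bits, UUIDv7 ids sort roughly by insertion time, which keeps index writes localized.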
#### With embedding callback

```python
db = VinkraDB("./data", dim=384, embedding_callback=my_embedding_fn)

# Just provide content — embeddings are generated automatically
db.add([
    {"content": "Hello world", "metadata": {"source": "doc1"}},
    {"content": "Another text"},
])
```
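The exact callback contract isn't spelled out here; based on the example above, a reasonable assumption is a function that takes the record's content string and returns a `dim`-length vector. A toy, deterministic stand-in (a real setup would call an embedding model instead; `my_embedding_fn` and its signature are assumptions, not vinkra API):

```python
import hashlib

DIM = 384

def my_embedding_fn(content):
    """Toy 'embedding': SHA-256 digest bytes cycled into DIM floats in [0, 1].
    Stand-in for a real embedding model; the signature is an assumption."""
    digest = hashlib.sha256(content.encode("utf-8")).digest()
    return [digest[i % len(digest)] / 255.0 for i in range(DIM)]

vec = my_embedding_fn("Hello world")
print(len(vec))  # 384
```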
#### Without callback

Provide embeddings directly:

```python
db.add([
    {"content": "Hello world", "embedding": [0.1] * 384, "metadata": {"source": "doc1"}},
    {"content": "Another text", "embedding": [0.2] * 384},
])
```
### Search (API)

Results include:

- `id`: vector ID
- `content`: text content
- `distance`: similarity score (lower is closer for euclidean)
- `metadata`: key-value pairs
- `embedding`: only if `include_vectors=True`

#### Without filters

```python
# Basic search
results = db.search(query_vec=[0.1] * 384, top_k=5)

# Include embeddings in results
results = db.search(query_vec=[0.1] * 384, include_vectors=True)
```
#### With filters

The filter syntax supports `==`, `!=`, `>`, `<`, `>=`, and `<=` with strings, numbers, and booleans. More operators are planned for future releases.

```python
results = db.search(
    query_vec=[0.1] * 384,
    top_k=10,
    filters=["source == 'doc1'", "score >= 50", "new == True"],
)
```
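To make those semantics concrete, here is a tiny evaluator for that filter shape. This is an illustration only — vinkra pushes filtering into SQLite, and its actual parser may differ:

```python
import operator
import re

OPS = {"==": operator.eq, "!=": operator.ne, ">=": operator.ge,
       "<=": operator.le, ">": operator.gt, "<": operator.lt}

def parse_literal(token):
    """String, bool, or number literal, as in the filter examples above."""
    token = token.strip()
    if token in ("True", "False"):
        return token == "True"
    if token.startswith("'") and token.endswith("'"):
        return token[1:-1]
    return float(token) if "." in token else int(token)

def matches(metadata, filters):
    """True iff the metadata dict satisfies every filter expression."""
    for expr in filters:
        field, op, literal = re.match(
            r"(\w+)\s*(==|!=|>=|<=|>|<)\s*(.+)", expr).groups()
        if field not in metadata or not OPS[op](metadata[field], parse_literal(literal)):
            return False
    return True

record = {"source": "doc1", "score": 72, "new": True}
print(matches(record, ["source == 'doc1'", "score >= 50", "new == True"]))  # True
```

Note that all filters are ANDed together: a record must satisfy every expression in the list to appear in the results.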
### Delete

#### Soft deletion (API)

Soft-delete vectors by ID without rebuilding the index — fast and efficient.

```python
# IDs come from search results or from add()
db.soft_delete(["0192a5b4-7f3c-7d6e-9a1b-2c3d4e5f6a7b", "0192a5b4-7f3c-7d6e-9a1b-2c3d4e5f6a7c"])
```
#### Compaction (API)

Physically remove soft-deleted records and reclaim storage:

```python
db.compact()
```

> [!WARNING]
> Compaction can take 20–200+ seconds with the `approximate` strategy, depending on data size. Run it during maintenance windows or off-peak hours. If not enough vectors remain to retrain the codec, the rebuild is skipped.
### Stats (API)

Get database statistics:

```python
stats = db.stats()
# {
#     "version": "...",
#     "dimension": 128,
#     "metric": "euclidean",
#     "strategy": "exact_search",
#     "last_saved_at": "...",
#     "last_deleted_at": "...",
#     "active_count": 1000,
#     "deleted_count": 5
# }
```
## 🚨 Exceptions (API)

Something gone wrong?

| Exception | When it hits |
|---|---|
| `InvalidInputError` | Bad data or invalid params |
| `VectorDimensionError` | Embedding dim mismatch |
| `InvalidIdError` | Malformed UUIDv7 |
| `FilterError` | Bad filter syntax |
## 🗺️ Features & Roadmap

- Incremental Inserts
- Hardware-Aware Auto-Switch
- Soft deletes + compaction
- Save/Load
- Filter DSL
  - Basic filters: quick comparisons
  - Complex filters: content matching, null checks, date/time literals, ...
- Recovery: recover soft-deleted vectors
- Collections: multi-collection support for managing multiple indices
- CLI: command-line interface
- REST API: HTTP API for remote vector operations
- Integrations: LangChain, LlamaIndex, and others
## 🔧 Core Dependencies

- rii — C++ ANN library with pybind11 bindings (IVF-PQ index storage)
- nanopq — pure-Python PQ encoding/decoding
- scipy — scientific computing (distance calculations)
- numpy — numerical computing
- SQLite — metadata storage (content, embeddings, metadata) and filtering queries
## 🤝 Contributing

Bug fixes, features, docs — all welcome. Check out CONTRIBUTING.md for the full details.

## 📜 License

MIT License — see the LICENSE file for all the details. Use freely, modify boldly, and credit appropriately! (We're not that legendary... yet 😎)