A simple, easy-to-hack Vector Database implementation
🌬️ A vector database implementation with a single dependency (numpy).
🎁 It can query 100,000 vectors and return results in about 100 milliseconds.
🏃 It's okay for your prototypes, maybe even more.
Install
Install from PyPI
pip install nano-vectordb
Install from source
# clone this repo first
cd nano-vectordb
pip install -e .
Quick Start
Faking your data:
from nano_vectordb import NanoVectorDB
import numpy as np
data_len = 100_000
fake_dim = 1024
fake_embeds = np.random.rand(data_len, fake_dim)
# ANYFIELDS is a placeholder for a dict of your own extra fields; it may be empty
ANYFIELDS = {}
fakes_data = [{"__vector__": fake_embeds[i], **ANYFIELDS} for i in range(data_len)]
You can add any fields to a record, but two keys are reserved:
- __id__: optional. If passed, NanoVectorDB will use your id; otherwise a generated id will be used.
- __vector__: required. Your embedding, as an np.ndarray.
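As a concrete illustration of the two reserved keys, the sketch below builds a few records by hand. The extra fields `text` and `source` and the `doc-N` id format are made up for this example; only `__id__` and `__vector__` come from the description above.

```python
import numpy as np

dim = 8  # small dimension to keep the example readable
rng = np.random.default_rng(0)

records = [
    {
        "__id__": f"doc-{i}",           # optional: omit it to get a generated id
        "__vector__": rng.random(dim),  # required: the embedding as an np.ndarray
        "text": f"chunk number {i}",    # hypothetical extra field
        "source": "example.txt",        # hypothetical extra field
    }
    for i in range(3)
]

# A record without "__id__" is also valid; only "__vector__" is mandatory.
records.append({"__vector__": rng.random(dim), "text": "no explicit id"})
```

A list shaped like `records` is what you would hand to `upsert`, provided `dim` matches the dimension the DB was created with.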
Init a DB:
vdb = NanoVectorDB(fake_dim, storage_file="fool.json")
Next time you init vdb from fool.json, NanoVectorDB will load the index automatically.
Upsert:
r = vdb.upsert(fakes_data)
print(r["update"], r["insert"])
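The returned `r` separates the ids that were updated from the ids that were newly inserted. The toy function below imitates that split with a plain dict. It illustrates upsert semantics in general and is not NanoVectorDB's code; `upsert_sketch` and `store` are hypothetical names, and the sketch assumes every record carries an explicit `__id__`.

```python
def upsert_sketch(store: dict, records: list) -> dict:
    """Insert new records and overwrite existing ones, keyed by "__id__"."""
    report = {"update": [], "insert": []}
    for rec in records:
        rid = rec["__id__"]  # this sketch requires an explicit id
        if rid in store:
            report["update"].append(rid)
        else:
            report["insert"].append(rid)
        store[rid] = rec  # same assignment covers both insert and overwrite
    return report


store = {}
first = upsert_sketch(store, [{"__id__": "a", "v": 1}, {"__id__": "b", "v": 2}])
second = upsert_sketch(store, [{"__id__": "b", "v": 20}, {"__id__": "c", "v": 3}])
print(first)   # {'update': [], 'insert': ['a', 'b']}
print(second)  # {'update': ['b'], 'insert': ['c']}
```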
Query:
print(vdb.query(np.random.rand(fake_dim)))
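This README does not say how `query` ranks vectors. One common approach at this scale is an exhaustive cosine-similarity scan over the whole matrix, which numpy can do in a single matrix-vector product. The sketch below shows that technique under this assumption; `top_k_cosine` is a hypothetical helper, not part of the NanoVectorDB API.

```python
import numpy as np


def top_k_cosine(matrix: np.ndarray, query: np.ndarray, k: int = 10):
    """Return (indices, scores) of the k rows most similar to `query`."""
    # Normalize rows and the query so a dot product equals cosine similarity.
    matrix_n = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    scores = matrix_n @ query_n
    k = min(k, len(scores))
    # argpartition selects the top k in linear time; only those k get sorted.
    top = np.argpartition(-scores, k - 1)[:k]
    top = top[np.argsort(-scores[top])]
    return top, scores[top]


rng = np.random.default_rng(0)
vectors = rng.random((1000, 64)).astype(np.float32)
idx, sims = top_k_cosine(vectors, vectors[42], k=5)
print(idx, sims)  # row 42 ranks first, since it is the query itself
```

Pre-normalizing the matrix once, instead of on every call, is the obvious optimization if you run many queries against the same data.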
Save:
# will create/overwrite 'fool.json'
vdb.save()
Get, Delete:
# get and delete the inserted data
print(vdb.get(r["insert"]))
vdb.delete(r["insert"])
Benchmark
Embedding Dim: 1024. Device: MacBook M3 Pro
- Saving an index with 100,000 vectors generates a roughly 520M JSON file.
- Inserting 100,000 vectors takes roughly 2 s.
- Querying 100,000 vectors takes roughly 0.1 s.
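A back-of-the-envelope check of the file size is sketched below. It assumes the vectors are stored as float32 and written into the JSON file as base64 text; the README states neither, so treat this as a plausibility check rather than a description of the storage format.

```python
n_vectors = 100_000
dim = 1024
bytes_per_float = 4  # assumption: float32 storage

raw_bytes = n_vectors * dim * bytes_per_float
# Base64 emits 4 output characters for every 3 input bytes (padded up).
base64_bytes = 4 * ((raw_bytes + 2) // 3)

print(f"raw matrix:     {raw_bytes / 2**20:.0f} MiB")
print(f"base64 encoded: {base64_bytes / 2**20:.0f} MiB")
```

Under these assumptions the encoded matrix alone comes to roughly 521 MiB, which is in line with the 520M figure quoted above.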