A simple, easy-to-hack Vector Database implementation
🌬️ A vector database implementation with a single dependency (numpy).
🎁 It can query 100,000 vectors and return results in roughly 100 milliseconds.
🏃 It's okay for your prototypes, maybe even more.
Install
Install from PyPI
pip install nano-vectordb
Install from source
# clone this repo first
cd nano-vectordb
pip install -e .
Quick Start
Faking your data:
from nano_vectordb import NanoVectorDB
import numpy as np
data_len = 100_000
fake_dim = 1024
fake_embeds = np.random.rand(data_len, fake_dim)
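# ANYFIELDS is a placeholder for your own extra fields; an empty dict keeps this runnable
ANYFIELDS = {}
fakes_data = [{"__vector__": fake_embeds[i], **ANYFIELDS} for i in range(data_len)]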
You can add any fields to a data item, but two keys are reserved:
- __id__: If passed, NanoVectorDB will use your id; otherwise a generated id will be used.
- __vector__: Required. Your embedding, as an np.ndarray.
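As a minimal sketch of the item shape described above, the hypothetical helper `make_record` (not part of NanoVectorDB) builds one item and sets `__id__` only when the caller supplies one. The extra field `text` is an arbitrary example.

```python
import numpy as np


def make_record(vector, record_id=None, **fields):
    """Build one item with the two reserved keys (hypothetical helper)."""
    record = {"__vector__": np.asarray(vector), **fields}
    if record_id is not None:
        # Only set __id__ when the caller wants to control the id.
        record["__id__"] = record_id
    return record


with_id = make_record(np.random.rand(4), record_id="doc-1", text="hello")
without_id = make_record(np.random.rand(4), text="world")
```

The second item carries no `__id__`, so it would rely on the generated id mentioned above.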
Init a DB:
vdb = NanoVectorDB(fake_dim, storage_file="fool.json")
Next time you init vdb from fool.json, NanoVectorDB will load the index automatically.
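The load-on-init behaviour can be pictured with a small stand-in. `TinyStore` below is a hypothetical class written for illustration only; it is not NanoVectorDB's code and stores plain lists rather than a real index.

```python
import json
import os
import tempfile


class TinyStore:
    """Hypothetical stand-in illustrating load-on-init with a JSON file."""

    def __init__(self, storage_file):
        self.storage_file = storage_file
        if os.path.exists(storage_file):
            # A previous save exists: restore it instead of starting empty.
            with open(storage_file, encoding="utf-8") as f:
                self.items = json.load(f)
        else:
            self.items = {}

    def save(self):
        # Create or overwrite the storage file with the current items.
        with open(self.storage_file, "w", encoding="utf-8") as f:
            json.dump(self.items, f)


path = os.path.join(tempfile.mkdtemp(), "tiny.json")
first = TinyStore(path)
first.items["a"] = [0.1, 0.2]
first.save()

second = TinyStore(path)  # same file, so the saved items come back
reloaded = second.items
```

Nothing is written until `save()` is called, so a store that is never saved starts empty again on the next init.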
Upsert:
r = vdb.upsert(fakes_data)
print(r["update"], r["insert"])
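To make the update/insert split concrete, here is a rough sketch over a plain dict. It assumes the two keys of the result hold lists of ids and that every item already carries an `__id__`; the function and the field `v` are illustrative, not NanoVectorDB's implementation.

```python
def upsert(store, records):
    """Update items whose id already exists, insert the rest.

    `store` is a plain dict keyed by id (an illustration only).
    """
    report = {"update": [], "insert": []}
    for record in records:
        record_id = record["__id__"]
        key = "update" if record_id in store else "insert"
        store[record_id] = record
        report[key].append(record_id)
    return report


store = {}
first = upsert(store, [{"__id__": "a", "v": 1}, {"__id__": "b", "v": 2}])
second = upsert(store, [{"__id__": "b", "v": 3}, {"__id__": "c", "v": 4}])
```

In this sketch the first call inserts both items, while the second call updates `b` and inserts `c`.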
Query:
print(vdb.query(np.random.rand(fake_dim)))
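For intuition about what a query over an in-memory matrix involves, the sketch below does a brute-force top-k search with numpy. Cosine similarity is an assumption here, since the text above does not say which metric or algorithm NanoVectorDB uses, and `brute_force_query` is a hypothetical name.

```python
import numpy as np


def brute_force_query(matrix, query, top_k=3):
    """Return (row index, cosine similarity) pairs for the top_k closest rows."""
    # Normalize rows and the query so a dot product equals cosine similarity.
    matrix_norm = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    query_norm = query / np.linalg.norm(query)
    scores = matrix_norm @ query_norm
    # argsort is ascending, so take the last top_k indices and reverse them.
    best = np.argsort(scores)[-top_k:][::-1]
    return [(int(i), float(scores[i])) for i in best]


rng = np.random.default_rng(0)
vectors = rng.random((1000, 64))
hits = brute_force_query(vectors, vectors[42], top_k=3)
```

Querying with a stored row should rank that row first, with a similarity of about 1.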
Save:
# will create/overwrite 'fool.json'
vdb.save()
Get, Delete:
# get and delete the inserted data
print(vdb.get(r["insert"]))
vdb.delete(r["insert"])
Benchmark
Embedding Dim: 1024. Device: MacBook M3 Pro
- Saving an index with 100,000 vectors generates a roughly 520 MB JSON file.
- Inserting 100,000 vectors takes roughly 2 s.
- Querying 100,000 vectors takes roughly 0.1 s.
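The reported file size can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes each vector component is stored as a 4-byte float32 and that the raw bytes are base64-encoded inside the JSON file; both are assumptions, not something the benchmark states.

```python
n_vectors = 100_000
dim = 1024
bytes_per_float = 4  # assumption: float32 storage

raw_bytes = n_vectors * dim * bytes_per_float
# base64 emits 4 output characters for every 3 input bytes (rounded up).
base64_bytes = 4 * ((raw_bytes + 2) // 3)

raw_mib = raw_bytes / 2**20
base64_mib = base64_bytes / 2**20
print(f"raw: {raw_mib:.1f} MiB, base64: {base64_mib:.1f} MiB")
```

Under these assumptions the encoded payload comes out near 521 MiB, which is in line with the reported size of roughly 520 megabytes.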
File details
Details for the file nano_vectordb-0.0.3.tar.gz.
File metadata
- Download URL: nano_vectordb-0.0.3.tar.gz
- Upload date:
- Size: 4.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.19
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0ac33df057c36378ac9c7c0666aa0e99004f12175d8fa9729fee8b760da8839c
MD5 | 9153ba1f9025b87156ad851832c04390
BLAKE2b-256 | e7410d9c2b8fe24676b800588e9a848929b21121cf17c265fc72305f105c570d
File details
Details for the file nano_vectordb-0.0.3-py3-none-any.whl.
File metadata
- Download URL: nano_vectordb-0.0.3-py3-none-any.whl
- Upload date:
- Size: 4.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.19
File hashes
Algorithm | Hash digest
---|---
SHA256 | e1b0ab70962ccaea6e706be0e166adad4aa76d4bfd3ec622803833d18398200b
MD5 | 08e4500e3690284da51d972987cef5ea
BLAKE2b-256 | 0437368f496de77183234ada1d459b47c4efd9ec450c9f49f7565070e7715a48