Project description

picovdb


An extremely fast, ultra-lightweight local vector database in Python.

"extremely fast": sub-millisecond query

"ultra-lighweight": One file with Numpy and one optional dependency faiss-cpu.

Install

pip install picovdb
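The description above lists faiss-cpu as an optional dependency; assuming it is picked up when present, installing it alongside picovdb should enable the FAISS-backed numbers shown in the benchmarks below:

```shell
pip install picovdb
pip install faiss-cpu   # optional: enables the FAISS (CPU) backend
```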

Usage

Create a db:

(using SentenceTransformer embeddings as an example)

from sentence_transformers import SentenceTransformer
from picovdb import PicoVectorDB

CHUNK_SIZE = 256
model = SentenceTransformer('all-MiniLM-L6-v2')
dim = model.get_sentence_embedding_dimension()

with open('A_Christmas_Carol.txt', encoding='utf-8') as f:
    content = f.read()

# Ceiling division keeps the last partial chunk without adding an empty one
num_chunks = (len(content) + CHUNK_SIZE - 1) // CHUNK_SIZE
chunks = [content[i * CHUNK_SIZE: (i + 1) * CHUNK_SIZE] for i in range(num_chunks)]
embeddings = model.encode(chunks)

data = [
    {
        "_vector_": embeddings[i],
        "_id_": i,
        "content": chunks[i],
    }
    for i in range(num_chunks)
]
db = PicoVectorDB(embedding_dim=dim, storage_file='_acc')
db.upsert(data)
db.save()

Query

# Reopen the database persisted above and run a similarity query
db = PicoVectorDB(embedding_dim=dim, storage_file='_acc')
txt = "Are there no prisons? Are there no workhouses?"
emb = model.encode(txt)
results = db.query(emb, top_k=3)
print('query results:', results)
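Under the hood, a pure-NumPy query of this kind amounts to a brute-force cosine-similarity search: normalize the stored vectors, then a single matrix-vector product ranks the whole collection. A minimal sketch of that idea (the function and variable names here are illustrative, not picovdb's actual API):

```python
import numpy as np

def top_k_cosine(vectors: np.ndarray, query: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the k stored vectors most similar to `query`."""
    k = min(k, len(vectors))
    # Normalize rows so dot products equal cosine similarities
    normed = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    sims = normed @ q                       # one matrix-vector product
    # argpartition finds the top-k in O(n); sort only those k afterwards
    top = np.argpartition(-sims, k - 1)[:k]
    return top[np.argsort(-sims[top])]

vectors = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]], dtype=np.float32)
idx = top_k_cosine(vectors, np.array([1.0, 0.0], dtype=np.float32), k=2)
print(idx)
```

Because the whole search is one BLAS-backed matrix product plus a partial sort, even a 100,000-vector scan stays in the low milliseconds on modern hardware.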

Benchmark

Embedding dimension: 1024

Hardware: M3 MacBook Air

  1. Pure Python:

    • Inserting 100,000 vectors took about 0.5s
    • Running 100 queries against 100,000 vectors took roughly 0.6s (0.006s per query).
  2. With FAISS (CPU):

    • Inserting 100,000 vectors took 110s
    • Running 100 queries against 100,000 vectors took 0.05s (0.5 ms per query).
    • Running 1,000 queries against 100,000 vectors in batch mode took 0.2s (0.2 ms per query).

Hardware: PC with CPU Core i7-12700k and old-gen M2 Nvme SSD

  1. Pure Python:

    • Inserting 100,000 vectors took about 0.7s
    • Running 100 queries against 100,000 vectors took roughly 1.3s (0.013s per query).
  2. With FAISS (CPU):

    • Inserting 100,000 vectors took 50s
    • Running 100 queries against 100,000 vectors took 0.05s (0.5 ms per query).
    • Running 1,000 queries against 100,000 vectors in batch mode took 0.3s (0.3 ms per query).
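The pure-Python figures above come down to timing that one matrix-vector product per query. A self-contained micro-benchmark in the same spirit (NumPy only, not picovdb itself; the sizes are scaled down from the 100,000 × 1024 setup so it runs quickly anywhere):

```python
import time
import numpy as np

N, DIM, QUERIES = 10_000, 1024, 100   # the benchmark above used N = 100_000

rng = np.random.default_rng(0)
vectors = rng.standard_normal((N, DIM), dtype=np.float32)
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)   # pre-normalize once

queries = rng.standard_normal((QUERIES, DIM), dtype=np.float32)
queries /= np.linalg.norm(queries, axis=1, keepdims=True)

start = time.perf_counter()
for q in queries:
    sims = vectors @ q                      # cosine similarity against all vectors
    top3 = np.argpartition(-sims, 3)[:3]    # top-3 candidates, unsorted
elapsed = time.perf_counter() - start
print(f"{QUERIES} queries over {N} vectors: {elapsed:.3f}s "
      f"({elapsed / QUERIES * 1000:.3f} ms/query)")
```

Pre-normalizing once and reusing the matrix is what keeps the per-query cost to a single dot-product pass; FAISS improves on this mainly via SIMD-optimized scanning and batched queries.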

Project details


Download files

Download the file for your platform.

Source Distribution

picovdb-0.1.1.tar.gz (6.5 kB)

Uploaded Source

Built Distribution


picovdb-0.1.1-py3-none-any.whl (7.2 kB)

Uploaded Python 3

File details

Details for the file picovdb-0.1.1.tar.gz.

File metadata

  • Download URL: picovdb-0.1.1.tar.gz
  • Upload date:
  • Size: 6.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.2 CPython/3.10.12 Linux/5.15.0-135-generic

File hashes

Hashes for picovdb-0.1.1.tar.gz

  • SHA256: 0939fa40f6bded3a87cb7ad01e61158078280641975cc5e265761b811c3746bf
  • MD5: cae96ec8a0dd25d5770b8fc2bbfcf82f
  • BLAKE2b-256: 332cdf02c41eafc8dd9c2dc11200764c112b95940466bfaa7c887aa6be6dadf2


File details

Details for the file picovdb-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: picovdb-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 7.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.2 CPython/3.10.12 Linux/5.15.0-135-generic

File hashes

Hashes for picovdb-0.1.1-py3-none-any.whl

  • SHA256: fe341abf87cf21aee79e2a1a20185e104ce3359d00a423ef05d7854c54634117
  • MD5: a9db3c491333fb7391d1dd74ff06de9f
  • BLAKE2b-256: bd268b419f2dfac861aaffc53d105a940c57478523a5d096ba9fe03f3c84f793

