Analytical vector database storage estimator
vector-db-sizer
What it is
vector-db-sizer is an analytical CLI estimator for vector database disk and RAM sizing.
When to use it
Use it for fast pre-implementation sizing work, such as:
- early architecture decisions;
- comparing vector dimensions;
- comparing engines;
- comparing index types;
- estimating metadata/payload impact;
- generating Markdown/CSV/JSON artifacts for architecture discussions.
What it does not do
- No live database connections.
- No ingestion or load execution.
- No latency/recall benchmarking.
- No pricing calculations.
- No production guarantee.
Quick start
uv sync
uv run vector-db-sizer validate examples/qdrant_text_hnsw.yaml
uv run vector-db-sizer estimate examples/qdrant_text_hnsw.yaml --format markdown --out report.md
uv run vector-db-sizer estimate examples/multi_scenario.yaml --format csv --out comparison.csv
Input YAML
name: qdrant_text_hnsw
dataset:
  source_type: text
  total_tokens: 50000000
  chunk_tokens: 512
  chunk_overlap: 64
embedding:
  kind: dense
  dimensions: 1536
  dtype: float32
database:
  engine: qdrant
  index_type: hnsw
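To give a feel for the arithmetic behind such a scenario, here is a minimal Python sketch that derives a chunk count and raw vector size from the values above. It is not the tool's actual implementation: the function names, the sliding-window chunking model (one continuous token stream, stride = chunk_tokens - chunk_overlap), and the dtype size table are all assumptions made for illustration.

```python
# Hypothetical helpers; vector-db-sizer may model chunking differently.
DTYPE_BYTES = {"float32": 4, "float16": 2, "int8": 1}  # assumed dtype sizes


def estimate_chunk_count(total_tokens: int, chunk_tokens: int, chunk_overlap: int) -> int:
    """Count sliding-window chunks over one continuous token stream (assumed model)."""
    if chunk_overlap >= chunk_tokens:
        raise ValueError("chunk_overlap must be smaller than chunk_tokens")
    if total_tokens <= 0:
        return 0
    if total_tokens <= chunk_tokens:
        return 1
    stride = chunk_tokens - chunk_overlap
    remaining = total_tokens - chunk_tokens
    # Integer ceiling division: one first chunk plus one chunk per stride step.
    return 1 + -(-remaining // stride)


def raw_vector_bytes(num_vectors: int, dimensions: int, dtype: str) -> int:
    """Uncompressed vector bytes: one vector per chunk, dimensions * bytes per value."""
    return num_vectors * dimensions * DTYPE_BYTES[dtype]


# Values taken from the example YAML above.
chunks = estimate_chunk_count(total_tokens=50_000_000, chunk_tokens=512, chunk_overlap=64)
raw = raw_vector_bytes(chunks, dimensions=1536, dtype="float32")
print(f"chunks={chunks:,} raw_vector_bytes={raw:,}")
```

Under these assumptions the example scenario yields 111,607 chunks and 685,713,408 bytes of raw float32 vectors (about 654 MiB), before payload, index, and engine overhead are added.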
Single-scenario example
uv run vector-db-sizer estimate examples/qdrant_text_hnsw.yaml --format markdown
Multi-scenario example
uv run vector-db-sizer estimate examples/multi_scenario.yaml --format csv
uv run vector-db-sizer estimate examples/multi_scenario.yaml --format json
Output formats
- json (machine-readable)
- markdown (human report)
- csv (comparison table)
Supported engines
- generic
- pgvector
- qdrant
- milvus
- elasticsearch
- opensearch
- weaviate
- pinecone
How to interpret the report
- Raw vectors: uncompressed/base vector bytes.
- Quantized vectors: additional quantized representation when modeled.
- Record payload: IDs + metadata/text/provenance payload bytes.
- Index disk: index structure bytes on disk.
- Engine overhead: engine/profile-level overhead approximation.
- Final disk estimate: replicated storage plus WAL/snapshot/safety factors.
- Final RAM estimate: vectors + payload + index + overhead RAM approximation.
- Warnings: profile caveats and scenario assumptions to review.
- Confidence: per-component confidence levels for planning.
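As a rough illustration of how the disk components listed above might roll up into a final figure, the following Python sketch sums them and applies replication, WAL/snapshot, and safety factors. The class, the function, the factor names, and the multiplicative way they combine are assumptions for illustration only; the tool's real formula may differ.

```python
from dataclasses import dataclass


@dataclass
class DiskComponents:
    """Per-component byte counts, named after the report fields (hypothetical type)."""
    raw_vectors: int
    quantized_vectors: int
    record_payload: int
    index_disk: int
    engine_overhead: int

    def base_bytes(self) -> int:
        return (self.raw_vectors + self.quantized_vectors + self.record_payload
                + self.index_disk + self.engine_overhead)


def final_disk_bytes(components: DiskComponents, replicas: int = 1,
                     wal_factor: float = 0.0, snapshot_factor: float = 0.0,
                     safety_factor: float = 1.0) -> int:
    """Assumed roll-up: replicate the base, add WAL/snapshot fractions, apply safety margin."""
    replicated = components.base_bytes() * replicas
    return round(replicated * (1.0 + wal_factor + snapshot_factor) * safety_factor)


# Made-up numbers, chosen only to keep the arithmetic easy to follow.
example = DiskComponents(raw_vectors=1000, quantized_vectors=0, record_payload=500,
                         index_disk=300, engine_overhead=200)
total = final_disk_bytes(example, replicas=2, wal_factor=0.25,
                         snapshot_factor=0.25, safety_factor=1.5)
print(total)
```

Here the base is 2,000 bytes, replication doubles it to 4,000, the WAL and snapshot fractions bring it to 6,000, and the safety factor gives 9,000.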
Confidence levels
- high: formulaic or type-level estimate.
- medium: useful engineering approximation.
- low: heuristic and engine-dependent; validate with pilot load.
Production sizing warning
The estimates are analytical and should be calibrated with a representative pilot load before production capacity planning.
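One simple way to calibrate is to compare the estimate for a pilot-sized scenario against what the pilot load actually consumed, then scale the full-size estimate by that ratio. The Python sketch below illustrates the idea; the function names are hypothetical, it is not a feature of vector-db-sizer, and it assumes the estimation error scales linearly with dataset size, which may not hold for index structures.

```python
def calibration_factor(measured_bytes: int, estimated_bytes: int) -> float:
    """Ratio of measured pilot usage to the analytical estimate for the same pilot."""
    if estimated_bytes <= 0:
        raise ValueError("estimated_bytes must be positive")
    return measured_bytes / estimated_bytes


def calibrated_estimate(full_scale_estimate_bytes: int, factor: float) -> int:
    """Scale the full-size analytical estimate by the pilot-derived factor."""
    return round(full_scale_estimate_bytes * factor)


# Made-up pilot numbers: the estimate said 1.00 MB, the pilot used 1.25 MB.
factor = calibration_factor(measured_bytes=1_250_000, estimated_bytes=1_000_000)
adjusted = calibrated_estimate(full_scale_estimate_bytes=80_000_000, factor=factor)
print(f"factor={factor} adjusted={adjusted:,}")
```

With these illustrative numbers the factor is 1.25, so an 80,000,000-byte analytical estimate becomes 100,000,000 bytes after calibration.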
Development
uv sync
uv run pytest
uv run ruff check .
Current limitations
- Engine profiles are approximate.
- No vendor pricing model.
- No actual DB measurements from live systems.
- No latency/recall estimation.
- No automatic database selection.