iNatInqPerf is a benchmark for evaluating the performance vs. cost trade-offs of running NLP-based search (like INQUIRE) on a platform such as iNaturalist with different vector databases.
Overview
This project provides a modular benchmark pipeline for experimenting with different vector databases (FAISS, Qdrant, …).
It runs end-to-end:
- Download → Hugging Face dataset (optionally export images + manifest)
- Embed → Generate CLIP embeddings for images
- Build → Construct indexes with multiple VectorDBs
- Search → Profile queries (latency + Recall@K vs exact baseline)
- Update → Test insertions & deletions (index maintenance)
All steps are run with uv as the package manager.
How to use iNatInqPerf
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Setup environment
uv venv .venv && source .venv/bin/activate
uv sync
# Run an end-to-end benchmark (FAISS IVF+PQ vectordb) on the INQUIRE dataset.
uv run python scripts/run_benchmark.py configs/inquire_benchmark.yaml
# Spin up a 3-node Weaviate cluster (shared Docker network + RAFT) and run the benchmark.
uv run python scripts/run_benchmark.py configs/inquire_benchmark_weaviate_cluster.yaml
# Spin up a 3-node Qdrant cluster (HTTP+gRPC+p2p) and run the benchmark.
uv run python scripts/run_benchmark.py configs/inquire_benchmark_qdrant_cluster.yaml
Distributed VectorDB Deployments
- Benchmark-managed clusters. The configs/inquire_benchmark_weaviate_cluster.yaml and configs/inquire_benchmark_qdrant_cluster.yaml files include the container descriptions that container_context will launch automatically before each run. Make sure no identically named containers are already running, otherwise Docker will raise a name-conflict error.
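A pre-flight check for the name conflict mentioned above could look like the following minimal sketch. It is not part of iNatInqPerf: the helper names and the container names in the demo are hypothetical, and the only real interface used is the Docker CLI call `docker ps -a --format "{{.Names}}"`. The `-a` flag is included because Docker also reports a name conflict for stopped containers that still exist.

```python
import subprocess


def existing_container_names():
    """Return the names of all containers (running or stopped) known to the local Docker daemon."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return {line.strip() for line in result.stdout.splitlines() if line.strip()}


def find_name_conflicts(wanted, existing):
    """Return the sorted container names that appear in both collections."""
    return sorted(set(wanted) & set(existing))


# Hypothetical names for illustration; the real ones come from the benchmark YAML config.
wanted = ["weaviate-node-1", "weaviate-node-2", "weaviate-node-3"]
conflicts = find_name_conflicts(wanted, {"weaviate-node-2", "unrelated-container"})
print(conflicts)
```

On a machine with Docker installed, passing `existing_container_names()` as the second argument checks against the real daemon; any reported name can then be removed with `docker rm` before starting the benchmark.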
The benchmarking code will:
- Download the specified dataset from the HuggingFace website.
- Embed the images using a CLIP model.
- Build a vector database index.
- Run the given queries to measure query latency, and compute Recall@K against the FAISS Flat (exact) baseline.
- Update the index.
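To make the Recall@K step concrete, here is a minimal sketch of one common definition: for each query, the fraction of the exact top-K neighbours (from the flat baseline) that the approximate index also returned, averaged over queries. The function name and the toy ID lists are assumptions for illustration; the project's own implementation may differ in detail.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Mean fraction of the exact top-k IDs that also appear in the approximate top-k."""
    if len(approx_ids) != len(exact_ids):
        raise ValueError("both searches must return one result list per query")
    if not exact_ids:
        return 0.0
    total = 0.0
    for approx, exact in zip(approx_ids, exact_ids):
        truth = set(exact[:k])
        if truth:
            total += len(truth & set(approx[:k])) / len(truth)
    return total / len(exact_ids)


# Toy data: two queries, result IDs as returned by an exact and an approximate search.
exact = [[1, 2, 3], [4, 5, 6]]
approx = [[1, 3, 9], [4, 5, 6]]
score = recall_at_k(approx, exact, k=3)
print(f"Recall@3 = {score:.3f}")
```

The comparison uses sets, so the order of results within the top K does not affect the score.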
Dataset Output Structure
data/raw/
  dataset_info.json
  state.json
  data-00000-of-00001.arrow
  images/
    00000000.jpg
    00000001.jpg
    ...
  images/manifest.csv  # [index,filename,label]
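The manifest can be read with the standard `csv` module. The sketch below assumes the file has a header row naming the three columns `index`, `filename`, and `label`; that header, the helper name, and the label values in the sample are assumptions, so adjust them to match the exported file.

```python
import csv
import io
from collections import Counter


def load_manifest(fileobj):
    """Parse manifest rows into (index, filename, label) tuples."""
    reader = csv.DictReader(fileobj)
    return [(int(row["index"]), row["filename"], row["label"]) for row in reader]


# In-memory sample standing in for images/manifest.csv (label values are made up).
sample = io.StringIO(
    "index,filename,label\n"
    "0,00000000.jpg,label_a\n"
    "1,00000001.jpg,label_b\n"
)
rows = load_manifest(sample)
label_counts = Counter(label for _, _, label in rows)
print(rows[0], dict(label_counts))
```

For a real export, replace the `io.StringIO` sample with `open(path, newline="")` on the manifest file.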
Supported Vector Databases
- faiss.flat (exact)
- faiss.ivfpq (IVF + OPQ + PQ)
Profiling Outputs
- Latency statistics (avg, p50, p95)
- Recall@K vs baseline
- JSON metrics in .results/
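The latency statistics listed above can be computed from per-query timings with the standard library alone. This is a sketch under assumptions, not the benchmark's actual code: `time_queries`, `latency_summary`, and the synthetic sample data are hypothetical names and values, and percentiles use the "inclusive" interpolation of `statistics.quantiles`.

```python
import statistics
import time


def time_queries(search_fn, queries):
    """Call search_fn once per query and return the latencies in milliseconds."""
    latencies_ms = []
    for query in queries:
        start = time.perf_counter()
        search_fn(query)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def latency_summary(latencies_ms):
    """Return avg, p50, and p95 for a list of at least two latencies."""
    if len(latencies_ms) < 2:
        raise ValueError("need at least two samples to compute percentiles")
    # n=100 yields 99 cut points; index 94 is the 95th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "avg": statistics.fmean(latencies_ms),
        "p50": statistics.median(latencies_ms),
        "p95": cuts[94],
    }


# Synthetic latencies of 1..100 ms stand in for real measurements.
samples = [float(i) for i in range(1, 101)]
summary = latency_summary(samples)
print(summary)
```

With a real index, `time_queries(lambda q: index_search(q), queries)` would produce the input list, where `index_search` is whatever search callable is being profiled.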
Profiling with py-spy
Use py-spy to record flamegraphs during any step:
bash scripts/pyspy_run.sh search-faiss -- python src/inatinqperf/benchmark/benchmark.py search --vectordb faiss.ivfpq --hf_dir data/emb_hf --topk 10 --queries src/inatinqperf/benchmark/queries.txt
Outputs:
- .results/search-faiss.svg (flamegraph)
- .results/search-faiss.speedscope.json
Installation
| Installation Method | Command |
|---|---|
| Via uv | uv add inatinqperf |
| Via pip | pip install inatinqperf |
Development
Please visit Contributing and Development for information on contributing to this project.
Additional Information
Additional information can be found at these locations.
| Title | Document | Description |
|---|---|---|
| Code of Conduct | CODE_OF_CONDUCT.md | Information about the norms, rules, and responsibilities we adhere to when participating in this open source community. |
| Contributing | CONTRIBUTING.md | Information about contributing to this project. |
| Development | DEVELOPMENT.md | Information about development activities involved in making changes to this project. |
| Governance | GOVERNANCE.md | Information about how this project is governed. |
| Maintainers | MAINTAINERS.md | Information about individuals who maintain this project. |
| Security | SECURITY.md | Information about how to privately report security issues associated with this project. |
License
iNatInqPerf is licensed under the MIT license.