brinicle is a C++ vector index engine (ANN library) optimized for disk-first, low-RAM similarity search.
brinicle
brinicle is a C++ retrieval engine built around disk-first, low-RAM HNSW search.
It supports:
- raw vector similarity search
- lexical, semantic, and hybrid search for structured items through one HNSW index
- autocomplete and query suggestion search
Benchmark
brinicle is designed for constrained environments where loading the full index into RAM is not practical.
In a 256MB RAM / 1 CPU container using MNIST 60K vectors:
| System | Outcome |
|---|---|
| brinicle | PASS |
| chroma | PASS |
| qdrant | OOMKilled |
| weaviate | OOMKilled |
| milvus | OOMKilled |
On SIFT 1M vectors, using the same in-process deployment model as FAISS and hnswlib:
| System | Build (s) | Recall@10 | Avg latency (ms) | QPS |
|---|---|---|---|---|
| faiss | 237.282 | 0.96999 | 0.092 | 10857.43 |
| hnswlib | 241.301 | 0.96364 | 0.093 | 10711.86 |
| brinicle | 243.75 | 0.96989 | 0.103 | 9730.65 |
In this benchmark suite, brinicle stays close to FAISS and hnswlib latency while using a disk-backed index design.
See the benchmark: brinicle benchmark
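Recall@10 in the table above is the usual ANN quality metric: the fraction of the true 10 nearest neighbors that the index actually returns, averaged over queries. A minimal sketch of that computation follows. The function names and toy ids are illustrative and are not part of brinicle or its benchmark suite.

```python
def recall_at_k(retrieved, ground_truth, k):
    """Fraction of the true top-k neighbors present in the retrieved top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(ground_truth[:k])
    if not truth:
        return 0.0
    return len(set(retrieved[:k]) & truth) / len(truth)


def mean_recall_at_k(all_retrieved, all_ground_truth, k):
    """Average recall@k over a batch of queries."""
    scores = [
        recall_at_k(retrieved, truth, k)
        for retrieved, truth in zip(all_retrieved, all_ground_truth)
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Toy example: 2 of the 3 true neighbors ("3" and "0") were returned.
print(recall_at_k(["3", "1", "0"], ["3", "0", "7"], k=3))
```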
Install
Install from PyPI:
pip install brinicle
Or build from source:
git clone https://github.com/bicardinal/brinicle.git
cd brinicle
pip install -e .
Engines
brinicle exposes three engines with the same lifecycle:
client.init(...)
client.ingest(...)
client.finalize()
client.search(...)
| Engine | Use case | Input |
|---|---|---|
| VectorEngine | Raw ANN vector search | float32 vectors |
| ItemSearchEngine | Lexical / semantic / hybrid item search | title, category, subcategory, attributes, optional vectors |
| AutocompleteEngine | Query/title suggestions | suggestion text |
Features
- Disk-first HNSW vector search
- Low-RAM indexing and querying
- Streaming-first ingest: one vector/item/suggestion at a time
- Insert, upsert, delete, and compact rebuild
- Raw vector search through VectorEngine
- Structured item search through ItemSearchEngine
- Lexical, semantic, and hybrid item search using one HNSW index
- Alpha-controlled item search: lexical-only, semantic-only, or hybrid
- Autocomplete/query suggestion search through AutocompleteEngine
Vector search
Use VectorEngine when you already have embeddings or numeric vectors.
import numpy as np
import brinicle
D = 2
n = 5
X = np.random.randn(n, D).astype(np.float32)
Q = np.random.randn(D).astype(np.float32)
engine = brinicle.VectorEngine(
    "vector_index",
    dim=D,
    delta_ratio=0.1,
)
engine.init(mode="build")
for eid in range(n):
    engine.ingest(str(eid), X[eid])
engine.finalize()
print(engine.search(Q, k=10))
search(...) returns a list of external ids:
["3", "1", "0"]
To return distances too:
print(engine.search_with_distance(Q, k=10))
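For a small dataset, an exact brute-force search is a handy reference to compare ANN results against. The sketch below is not part of brinicle. It assumes squared Euclidean distance, which the text here does not confirm as the metric brinicle uses, and `exact_search` is a hypothetical helper name.

```python
import heapq


def exact_search(ids, vectors, query, k):
    """Return the k external ids closest to `query` by squared L2 distance."""
    def sq_l2(vec):
        return sum((a - b) ** 2 for a, b in zip(vec, query))

    # Pair each distance with its id, then keep the k smallest pairs.
    scored = ((sq_l2(vec), eid) for eid, vec in zip(ids, vectors))
    return [eid for _, eid in heapq.nsmallest(k, scored)]


ids = ["0", "1", "2"]
vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
print(exact_search(ids, vectors, [0.9, 0.9], k=2))  # closest id first
```

Running the same query through the engine and through this baseline on identical data gives a quick sanity check of the returned ids.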
To do batch search:
engine.search_batch(Qs)
Insert
Y = np.random.randn(5, D).astype(np.float32)
engine.init(mode="insert")
for eid in range(5):
    engine.ingest(str(eid) + "x", Y[eid])
engine.finalize()
print(engine.search(Q, k=10))
Upsert
Y = np.random.randn(5, D).astype(np.float32)
engine.init(mode="upsert")
for eid in range(5):
    engine.ingest(str(eid), Y[eid])
engine.finalize()
print(engine.search(Q, k=10))
Delete
engine.delete_items(["1", "4"])
print(engine.search(Q, k=10))
Rebuild / optimize
engine.optimize_graph()
print(engine.search(Q, k=10))
Item search
ItemSearchEngine searches catalog-like records with titles, metadata, and optional semantic vectors.
Each item can contain:
- title
- category
- subcategory
- attributes
- an optional semantic vector
Only title is required. The other fields are optional.
ItemSearchEngine can run in three practical modes:
| Mode | How to use it |
|---|---|
| Lexical-only item search | Use structured fields only and set alpha=0.0 |
| Semantic-only item search | Provide vectors and set alpha=1.0 |
| Hybrid item search | Provide structured fields and vectors, then use an alpha between 0.0 and 1.0 |
brinicle does not build separate lexical and vector indexes for item search. Structured lexical signals and optional semantic vectors are encoded into one numeric representation and searched through the same HNSW graph.
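The paragraph above says lexical signals are encoded into one numeric representation but does not describe the encoding. Purely as an illustration, the sketch below uses feature hashing (the hashing trick) to turn structured fields into a fixed-size vector. This is a stand-in technique, not brinicle's actual encoder, and `hash_encode` is a hypothetical name.

```python
import math
import zlib


def hash_encode(fields, dim):
    """Map field tokens into a unit-length vector of size `dim` via feature hashing."""
    vec = [0.0] * dim
    for name, text in fields.items():
        for token in text.lower().split():
            # crc32 is deterministic across runs, unlike Python's built-in hash().
            bucket = zlib.crc32(f"{name}:{token}".encode("utf-8")) % dim
            vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


item = {"title": "Apple iPhone 15 Pro Max", "category": "Electronics"}
encoded = hash_encode(item, dim=96)
print(len(encoded), sum(1 for v in encoded if v != 0.0))
```

In this toy model a smaller `dim` forces more tokens to share a bucket, so distinct items become harder to tell apart.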
Lexical item search
Use lexical item search when you want structured catalog search without external embeddings.
import brinicle
engine = brinicle.ItemSearchEngine(
    "item_index",
    dim=96,  # a larger dim gives more encoding space and less truncation
    alpha=0.0,  # lexical-only
)
engine.init(mode="build")
engine.ingest(
    external_id="p1",
    title="Apple iPhone 15 Pro Max 256GB Natural Titanium",
    category="Electronics",
    subcategory="Smartphones",
    attributes={
        "brand": "Apple",
        "storage": "256GB",
        "color": "Natural Titanium",
    },
)
engine.ingest(
    external_id="p2",
    title="Samsung Galaxy S24 Ultra 512GB Black",
    category="Electronics",
    subcategory="Smartphones",
    attributes={
        "brand": "Samsung",
        "storage": "512GB",
        "color": "Black",
    },
)
engine.finalize()
print(engine.search("iphone 15 pro max", k=10))
Hybrid item search
Use hybrid item search when you want exact structured signals and semantic similarity in the same retrieval path.
import numpy as np
import brinicle
VECTOR_DIM = 384
engine = brinicle.ItemSearchEngine(
    "hybrid_item_index",
    dim=96,
    vector_dim=VECTOR_DIM,
    alpha=0.95,  # mostly semantic, with lexical correction
    vector_normalized=True,
    M=48,
    ef_construction=1024,
    ef_search=512,
)
engine.init(mode="build")
engine.ingest(
    external_id="p1",
    title="Apple iPhone 15 Pro Max 256GB Natural Titanium",
    category="Electronics",
    subcategory="Smartphones",
    attributes={
        "brand": "Apple",
        "storage": "256GB",
        "color": "Natural Titanium",
    },
    vector=np.random.randn(VECTOR_DIM).astype("float32"),
    normalize=True,
)
engine.ingest(
    external_id="p2",
    title="Samsung Galaxy S24 Ultra 512GB Black",
    category="Electronics",
    subcategory="Smartphones",
    attributes={
        "brand": "Samsung",
        "storage": "512GB",
        "color": "Black",
    },
    vector=np.random.randn(VECTOR_DIM).astype("float32"),
    normalize=True,
)
engine.finalize()
query_vector = np.random.randn(VECTOR_DIM).astype("float32")
results = engine.search(
    "iphone 15 pro max",
    category="Electronics",
    subcategory="Smartphones",
    attributes={
        "brand": "Apple",
    },
    vector=query_vector,
    normalize=True,
    k=10,
)
print(results)
To return distances:
print(engine.search_with_distance("iphone 15", k=10))
Understanding alpha
alpha controls the balance between semantic vector similarity and structured lexical matching.
| alpha | Behavior |
|---|---|
| 0.0 | lexical-only |
| 0.5 | balanced lexical + semantic |
| 0.95 | mostly semantic, with lexical correction |
| 1.0 | semantic-only |
For semantic-only and hybrid search, pass vector_dim during engine construction and provide vectors during ingest(...) and search(...).
Choose alpha before building the index. In brinicle, alpha affects graph construction as well as search scoring; it is not only a query-time reranking parameter.
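One way to see why alpha would have to be fixed at build time is a toy model in which each stored vector is the concatenation of a semantic part scaled by sqrt(alpha) and a lexical part scaled by sqrt(1 - alpha). The squared L2 distance between two such vectors then splits into alpha times the semantic distance plus (1 - alpha) times the lexical distance, so the stored vectors themselves depend on alpha. This is an assumption made for illustration. The text here does not describe brinicle's actual encoding, and `blend` and `sq_l2` are hypothetical helpers.

```python
import math


def blend(semantic, lexical, alpha):
    """Concatenate both parts, weighting semantic by alpha and lexical by 1 - alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    w_sem, w_lex = math.sqrt(alpha), math.sqrt(1.0 - alpha)
    return [w_sem * x for x in semantic] + [w_lex * x for x in lexical]


def sq_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Two items with identical lexical parts but different semantic parts.
sem_a, sem_b = [1.0, 0.0], [0.0, 1.0]
lex_a, lex_b = [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]

for alpha in (0.0, 0.5, 1.0):
    distance = sq_l2(blend(sem_a, lex_a, alpha), blend(sem_b, lex_b, alpha))
    print(alpha, round(distance, 6))
```

With alpha=0.0 the two items are indistinguishable in this model, and the distance grows as alpha moves toward 1.0.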
Autocomplete
AutocompleteEngine provides low-RAM autocomplete and query suggestion search using brinicle's HNSW infrastructure.
It can be used to index:
- popular queries
- item titles
- category names
- curated suggestions
import brinicle
ac = brinicle.AutocompleteEngine(
    "autocomplete_index",
    dim=48,
)
ac.init(mode="build")
ac.ingest("iphone 15 pro max", "iphone 15 pro max")
ac.ingest("iphone 15 case", "iphone 15 case")
ac.ingest("samsung s24 ultra", "samsung s24 ultra")
ac.finalize()
print(ac.search("iph", k=5))
Autocomplete currently works best for prefix-aligned query and title suggestions.
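"Prefix-aligned" here means the typed text matches the start of a suggestion, as "iph" does for "iphone 15 case". The sketch below is an exact prefix matcher over a sorted list, useful as a small reference when checking autocomplete results. It is not how AutocompleteEngine works internally, and `prefix_matches` is a hypothetical helper.

```python
import bisect


def prefix_matches(suggestions, prefix, k=5):
    """Return up to k suggestions that start with `prefix`, in sorted order."""
    ordered = sorted(suggestions)
    # All strings sharing a prefix are contiguous in sorted order.
    start = bisect.bisect_left(ordered, prefix)
    matches = []
    for text in ordered[start:]:
        if len(matches) >= k or not text.startswith(prefix):
            break
        matches.append(text)
    return matches


suggestions = ["iphone 15 pro max", "iphone 15 case", "samsung s24 ultra"]
print(prefix_matches(suggestions, "iph"))      # both iphone suggestions
print(prefix_matches(suggestions, "15 case"))  # not prefix-aligned, so no matches
```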
Streaming-first ingest
brinicle ingests records one at a time, so the full dataset does not need to fit in memory.
client.init(mode="build")
for item in stream_items():
    client.ingest(...)
client.finalize()
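As a rough sketch of the streaming pattern above, the generator below reads a JSON Lines file one record at a time and feeds each record through the init / ingest / finalize lifecycle. The file layout (one object per line with "id" and "vector" keys), the `build_from_stream` helper, and the `RecordingClient` stand-in are assumptions made for illustration. A real VectorEngine would likely need each vector converted to a float32 array before ingest.

```python
import json


def stream_items(path):
    """Yield one record at a time from a JSON Lines file."""
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def build_from_stream(client, items):
    """Drive init / ingest / finalize and return the number of records ingested."""
    client.init(mode="build")
    count = 0
    for item in items:
        client.ingest(item["id"], item["vector"])
        count += 1
    client.finalize()
    return count


class RecordingClient:
    """Hypothetical stand-in for an engine; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def init(self, mode):
        self.calls.append(("init", mode))

    def ingest(self, external_id, vector):
        self.calls.append(("ingest", external_id))

    def finalize(self):
        self.calls.append(("finalize",))
```

Because `stream_items` is a generator, only one record is held in memory at a time regardless of file size.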
Configuration
brinicle exposes common HNSW parameters:
- M
- ef_construction
- ef_search
- delta_ratio
Example:
engine = brinicle.VectorEngine(
    "vector_index",
    dim=384,
    M=48,
    ef_construction=1024,
    ef_search=512,
    delta_ratio=0.1,
)
Item search supports alpha-controlled lexical, semantic, and hybrid scoring.
engine = brinicle.ItemSearchEngine(
    "item_index",
    dim=96,
    vector_dim=384,
    alpha=0.95,
)
Advanced users can pass a custom LexicalConfig.
cfg = brinicle.LexicalConfig()
cfg.search_title_weight = 0.60
cfg.search_category_weight = 0.15
cfg.search_subcategory_weight = 0.15
cfg.search_attr_weight = 0.1
cfg.build_title_weight = 0.6
cfg.build_category_weight = 0.15
cfg.build_subcategory_weight = 0.15
cfg.build_attr_weight = 0.1
engine = brinicle.ItemSearchEngine(
    "item_index",
    dim=96,
    lexical_config=cfg,
)
Autocomplete also supports its own scoring configuration.
cfg = brinicle.AutocompleteConfig()
cfg.search_position_decay = 0.5
cfg.search_length_penalty = 0.2
ac = brinicle.AutocompleteEngine(
    "autocomplete_index",
    dim=48,
    autocomplete_config=cfg,
)
Index files
For an index path such as:
engine = brinicle.VectorEngine("my_index", dim=128)
brinicle stores index files beside that base path:
my_index.main
my_index.delta
my_index.lock
High-level engines such as ItemSearchEngine and AutocompleteEngine may also store tokenizer and encoding metadata beside the index.
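Since the index lives in plain files beside the base path, it is easy to check what is on disk. The helpers below are a small sketch built only on the three suffixes listed above. `index_files` and `index_size_bytes` are hypothetical names, and any extra tokenizer or encoding metadata files are not covered because their names are not given here.

```python
from pathlib import Path

INDEX_SUFFIXES = (".main", ".delta", ".lock")


def index_files(base_path):
    """Return the expected index file paths beside `base_path`."""
    # Append to the full string so dots already in the base name are preserved.
    return [Path(str(base_path) + suffix) for suffix in INDEX_SUFFIXES]


def index_size_bytes(base_path):
    """Total on-disk size of the index files that currently exist."""
    return sum(p.stat().st_size for p in index_files(base_path) if p.is_file())


for path in index_files("my_index"):
    print(path, "exists" if path.exists() else "missing")
```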
Which engine should I use?
Use VectorEngine for raw ANN search over embeddings or numeric vectors.
Use ItemSearchEngine for catalog-like records with titles, metadata, and optional semantic vectors:
- alpha=0.0 for lexical-only search
- alpha=1.0 for semantic-only search
- 0.0 < alpha < 1.0 for hybrid search
Use AutocompleteEngine for low-RAM query or title suggestions.
License
brinicle is licensed under the Apache License, Version 2.0.
See the LICENSE file.