brinicle is a C++ vector index engine (ANN library) optimized for disk-first, low-RAM similarity search.
brinicle
brinicle is a C++ retrieval engine built around disk-first, low-RAM HNSW search.
It supports:
- raw vector similarity search
- structured item search
- autocomplete/query suggestion search
Benchmark
brinicle is designed for constrained environments where loading the full index into RAM is not practical.
In a 256MB RAM / 1 CPU container on MNIST 60K vectors, the benchmark result was:
| System | Outcome |
|---|---|
| brinicle | PASS |
| chroma | PASS |
| qdrant | OOMKilled |
| weaviate | OOMKilled |
| milvus | OOMKilled |
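A back-of-envelope calculation suggests why a 256MB limit is tight for this dataset. The sketch below assumes vectors are stored as float32 (4 bytes per component) and that MNIST images are flattened to 784 dimensions (28 x 28). It counts raw vector data only and ignores graph links and process overhead, which would add to the total. The helper name is illustrative, not part of brinicle.

```python
def raw_vector_bytes(n_vectors: int, dim: int, bytes_per_component: int = 4) -> int:
    """Bytes needed to hold the raw vectors alone (no graph, no metadata)."""
    return n_vectors * dim * bytes_per_component


MIB = 1024 * 1024

mnist = raw_vector_bytes(60_000, 784)      # 28 x 28 pixels, flattened
sift = raw_vector_bytes(1_000_000, 128)    # SIFT descriptors are 128-dimensional

print(f"MNIST 60K: {mnist / MIB:.1f} MiB")   # MNIST 60K: 179.4 MiB
print(f"SIFT 1M:   {sift / MIB:.1f} MiB")    # SIFT 1M:   488.3 MiB
```

Under these assumptions the MNIST vectors alone take roughly 70% of a 256MB container before any index structure or runtime is counted.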
On SIFT 1M vectors, using the same in-process deployment model as FAISS and hnswlib:
| System | Build (s) | Recall@10 | Avg latency (ms) | QPS |
|---|---|---|---|---|
| faiss | 237.282 | 0.96999 | 0.092 | 10857.43 |
| hnswlib | 241.301 | 0.96364 | 0.093 | 10711.86 |
| brinicle | 243.75 | 0.96989 | 0.103 | 9730.65 |
In this benchmark, brinicle's recall is on par with FAISS (0.96989 vs 0.96999) and its average latency is about 0.01 ms higher (0.103 ms vs 0.092 ms), while the index stays disk-backed and memory usage stays low.
See the benchmark: brinicle benchmark
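For readers who want to reproduce the Recall@10 column on their own data, the following is a minimal sketch. It assumes the usual definition (the fraction of the true k nearest neighbours that appear in the returned top k, averaged over queries); the benchmark suite may compute it differently. The function and variable names are illustrative.

```python
def recall_at_k(retrieved, ground_truth, k=10):
    """Mean fraction of the true top-k neighbours present in the retrieved top-k."""
    if not retrieved or len(retrieved) != len(ground_truth):
        raise ValueError("need one non-empty result list per ground-truth list")
    total = 0.0
    for found, truth in zip(retrieved, ground_truth):
        true_ids = set(truth[:k])
        hits = len(true_ids.intersection(found[:k]))
        total += hits / len(true_ids)
    return total / len(retrieved)


# Two toy queries with k=3: the first finds 2 of 3 true neighbours, the second all 3.
retrieved = [["a", "b", "x"], ["d", "e", "f"]]
ground_truth = [["a", "b", "c"], ["f", "e", "d"]]
score = recall_at_k(retrieved, ground_truth, k=3)
print(round(score, 4))  # 0.8333
```

The ground truth would normally come from an exact (brute-force) search over the same vectors.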
brinicle keeps the same simple lifecycle across all engines:
client.init(...)
client.ingest(...)
client.finalize()
client.search(...)
Features
- Disk-first HNSW vector search
- Low-RAM indexing and querying
- Streaming-first ingest: one vector/item/suggestion at a time
- Insert, upsert, delete, and compact rebuild
- Raw vector search through VectorEngine
- Structured item search through ItemSearchEngine
- Autocomplete/query suggestion search through AutocompleteEngine
- Custom scoring for lexical item search and autocomplete
- Python bindings over a C++ core
Install
Install from PyPI:
pip install brinicle
Or build from source:
git clone https://github.com/bicardinal/brinicle.git
cd brinicle
bash build.sh
Engines
brinicle exposes three engines with the same lifecycle.
| Engine | Use case | Input |
|---|---|---|
| VectorEngine | Raw ANN vector search | float32 vectors |
| ItemSearchEngine | Structured catalog/item search | title, category, subcategory, attributes |
| AutocompleteEngine | Query/title suggestions | suggestion text |
All engines follow the same pattern:
client.init(mode="build")
for record in records:
    client.ingest(...)
client.finalize()
results = client.search(...)
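Because the lifecycle is shared, loading code can be written once against the four method names. The sketch below shows one way to do that. `InMemoryClient` is a hypothetical stand-in written only so the example runs without brinicle installed; its substring search and commit-on-finalize behaviour are assumptions of the toy, not a description of brinicle's internals. The `load` helper and the `(external_id, payload)` record shape are also illustrative.

```python
class InMemoryClient:
    """Hypothetical stand-in with the same four-step lifecycle; not part of brinicle."""

    def __init__(self):
        self._pending = {}
        self._committed = {}
        self._mode = None

    def init(self, mode="build"):
        self._mode = mode
        self._pending = {}

    def ingest(self, external_id, payload):
        if self._mode is None:
            raise RuntimeError("call init() before ingest()")
        self._pending[external_id] = payload

    def finalize(self):
        if self._mode == "build":
            self._committed = dict(self._pending)   # build replaces everything
        else:
            self._committed.update(self._pending)   # other modes merge
        self._pending = {}
        self._mode = None

    def search(self, query, k=10):
        # Toy substring match; a real engine would run an ANN search here.
        hits = [eid for eid, text in self._committed.items() if query in text]
        return hits[:k]


def load(client, records, mode="build"):
    """Run init -> ingest -> finalize over any iterable of (id, payload) pairs."""
    client.init(mode=mode)
    count = 0
    for external_id, payload in records:
        client.ingest(external_id, payload)
        count += 1
    client.finalize()
    return count


client = InMemoryClient()
n = load(client, [("1", "red shoes"), ("2", "blue shoes"), ("3", "red hat")])
print(n, client.search("red"))  # 3 ['1', '3']
```

Since `load` accepts any iterable, it works equally with a list or with a generator that reads records lazily.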
Vector search
Use VectorEngine when you already have embeddings or numeric vectors.
import numpy as np
import brinicle

D = 2  # vector dimensionality
n = 5  # number of vectors to index
X = np.random.randn(n, D).astype(np.float32)  # vectors to index
Q = np.random.randn(D).astype(np.float32)     # query vector

engine = brinicle.VectorEngine(
    "vector_index",
    dim=D,
    delta_ratio=0.1,
)

# Build the index one vector at a time, then search it.
engine.init(mode="build")
for eid in range(n):
    engine.ingest(str(eid), X[eid])
engine.finalize()

print(engine.search(Q, k=10))
search(...) returns a list of external ids:
["3", "1", "0"]
To return distances too:
print(engine.search_with_distance(Q, k=10))
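On small datasets it can be useful to compare ANN results against an exact search. The sketch below is a brute-force baseline in NumPy. It assumes squared Euclidean distance; this README does not state which metric brinicle uses, so the distances may not be directly comparable to those from search_with_distance. The function name is illustrative.

```python
import numpy as np


def exact_top_k(X, ids, q, k=10):
    """Return (id, squared L2 distance) pairs for the k vectors closest to q."""
    diffs = X - q                                   # broadcast q across all rows
    dists = np.einsum("ij,ij->i", diffs, diffs)     # row-wise squared norms
    k = min(k, len(ids))
    order = np.argsort(dists, kind="stable")[:k]    # nearest first
    return [(ids[i], float(dists[i])) for i in order]


rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2)).astype(np.float32)
Q = rng.standard_normal(2).astype(np.float32)
ids = [str(i) for i in range(len(X))]

exact = exact_top_k(X, ids, Q, k=3)
print([eid for eid, _ in exact])
```

Comparing the id list from this function with the engine's search output gives a quick per-query recall check.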
Insert
Y = np.random.randn(5, D).astype(np.float32)
engine.init(mode="insert")
for eid in range(5):
    engine.ingest(str(eid) + "x", Y[eid])
engine.finalize()
print(engine.search(Q, k=10))
Upsert
Y = np.random.randn(5, D).astype(np.float32)
engine.init(mode="upsert")
for eid in range(5):
    engine.ingest(str(eid), Y[eid])
engine.finalize()
print(engine.search(Q, k=10))
Delete
engine.delete_items(["1", "4"])
print(engine.search(Q, k=10))
Rebuild / optimize
engine.optimize_graph()
print(engine.search(Q, k=10))
Item search
ItemSearchEngine searches structured catalog-like records without requiring a traditional inverted index.
Each item can contain:
- title
- category
- subcategory
- attributes
Only title is required. The other fields are optional.
Items are encoded internally into fixed-size numeric representations and searched through brinicle's HNSW graph using a structured lexical scorer.
import brinicle
engine = brinicle.ItemSearchEngine(
    "item_index",
    dim=96,
)

engine.init(mode="build")

engine.ingest(
    external_id="p1",
    title="Apple iPhone 15 Pro Max 256GB Natural Titanium",
    category="Electronics",
    subcategory="Smartphones",
    attributes={
        "brand": "Apple",
        "storage": "256GB",
        "color": "Natural Titanium",
    },
)

engine.ingest(
    external_id="p2",
    title="Samsung Galaxy S24 Ultra 512GB Black",
    category="Electronics",
    subcategory="Smartphones",
    attributes={
        "brand": "Samsung",
        "storage": "512GB",
        "color": "Black",
    },
)

engine.finalize()
print(engine.search("iphone 15 pro max", k=10))
To return distances:
print(engine.search_with_distance("iphone 15", k=10))
Example with structured query fields:
results = engine.search(
    "iphone 15",
    category="Electronics",
    subcategory="Smartphones",
    attributes={
        "brand": "Apple",
    },
    k=10,
)
What can Item Search be used for?
ItemSearchEngine is useful for structured catalog-like data such as:
- products
- movies
- books
- jobs
- real estate listings
- restaurants
- games
- records with titles and attributes
Item Search is not a neural embedding model. It uses structured symbolic encoding and a configurable scorer.
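To give a feel for what a symbolic, non-neural encoding into a fixed-size vector can look like, here is a generic feature-hashing sketch over character n-grams. This is not brinicle's encoder, whose details are not described here. The function name, the n-gram size, the boundary markers, and the use of CRC32 are all assumptions chosen for illustration.

```python
import zlib

import numpy as np


def hash_encode(text: str, dim: int = 96, n: int = 3) -> np.ndarray:
    """Hash character n-grams of each token into a fixed-size, L2-normalised vector."""
    vec = np.zeros(dim, dtype=np.float32)
    for token in text.lower().split():
        padded = f"^{token}$"                       # mark token boundaries
        for i in range(max(1, len(padded) - n + 1)):
            gram = padded[i:i + n]
            bucket = zlib.crc32(gram.encode("utf-8")) % dim
            vec[bucket] += 1.0                      # count n-grams per bucket
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


a = hash_encode("Apple iPhone 15 Pro Max")
b = hash_encode("iphone 15 pro")
c = hash_encode("Samsung Galaxy S24 Ultra")

# Cosine similarity: overlapping tokens share n-grams, so a is closer to b than to c.
print(float(a @ b), float(a @ c))
```

No model is trained in this scheme; similarity comes purely from shared character sequences, which is why such encodings behave lexically rather than semantically.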
Autocomplete
AutocompleteEngine provides low-RAM autocomplete and query suggestion search using brinicle's HNSW infrastructure.
It can be used to index:
- popular queries
- item titles
- category names
- curated suggestions
import brinicle
ac = brinicle.AutocompleteEngine(
    "autocomplete_index",
    dim=48,
)
ac.init(mode="build")
ac.ingest("iphone 15 pro max", "iphone 15 pro max")
ac.ingest("iphone 15 case", "iphone 15 case")
ac.ingest("samsung s24 ultra", "samsung s24 ultra")
ac.finalize()
print(ac.search("iph", k=5))
AutocompleteEngine follows the same lifecycle as the other engines:
ac.init(mode="build")
ac.ingest(...)
ac.finalize()
ac.search(...)
The current autocomplete implementation is experimental and works best when query prefixes align well with encoded token prefixes.
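Since the implementation is experimental, it may help to check its output against an exact prefix match on a small suggestion list. The sketch below is such a baseline, using binary search over a sorted list from the standard library. It is an independent reference, not how AutocompleteEngine works internally, and the function name is illustrative.

```python
import bisect


def prefix_matches(sorted_suggestions, prefix, k=5):
    """Return up to k suggestions that start with prefix, in lexicographic order."""
    if k <= 0:
        return []
    # All strings with this prefix form one contiguous run in a sorted list.
    start = bisect.bisect_left(sorted_suggestions, prefix)
    out = []
    for s in sorted_suggestions[start:]:
        if not s.startswith(prefix):
            break
        out.append(s)
        if len(out) == k:
            break
    return out


suggestions = sorted([
    "iphone 15 pro max",
    "iphone 15 case",
    "samsung s24 ultra",
])
print(prefix_matches(suggestions, "iph"))  # ['iphone 15 case', 'iphone 15 pro max']
```

Suggestions that the baseline returns but the engine misses for the same prefix are candidates for the prefix-alignment issue described above.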
Streaming-first ingest
brinicle does not require loading the full dataset into memory.
Ingest is intentionally one record at a time:
client.init(mode="build")
for item in stream_items():
    client.ingest(...)
client.finalize()
Users can stream data from:
- JSONL files
- databases
- APIs
- object storage
- custom pipelines
brinicle does not assume that your dataset fits in RAM. Rare in modern software.
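The stream_items() call in the loop above is left undefined. For the JSONL case, it could be sketched as a generator that reads one line at a time, so only a single record is held in memory. The record fields ("id", "title") and the sample file are hypothetical and exist only to make the example runnable.

```python
import json
import os
import tempfile


def stream_items(path):
    """Yield one parsed record per non-empty line of a JSONL file."""
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:          # the file object is iterated lazily
            line = line.strip()
            if line:
                yield json.loads(line)


# Write a tiny sample file so the example runs on its own.
with tempfile.NamedTemporaryFile(
    "w", suffix=".jsonl", delete=False, encoding="utf-8"
) as tmp:
    tmp.write('{"id": "p1", "title": "Apple iPhone 15 Pro Max"}\n')
    tmp.write("\n")              # blank lines are skipped
    tmp.write('{"id": "p2", "title": "Samsung Galaxy S24 Ultra"}\n')
    sample_path = tmp.name

titles = [item["title"] for item in stream_items(sample_path)]
os.remove(sample_path)
print(titles)  # ['Apple iPhone 15 Pro Max', 'Samsung Galaxy S24 Ultra']
```

In an ingest loop, each yielded record would be passed to client.ingest(...) with whatever arguments the chosen engine expects.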
Configuration
brinicle exposes common HNSW parameters:
- M
- ef_construction
- ef_search
- delta_ratio
Example:
engine = brinicle.VectorEngine(
    "vector_index",
    dim=384,
    M=48,
    ef_construction=1024,
    ef_search=512,
    delta_ratio=0.1,
)
Item search also supports lexical scoring configuration.
cfg = brinicle.LexicalConfig()
cfg.search_title_weight = 0.60
cfg.search_category_weight = 0.15
cfg.search_subcategory_weight = 0.15
engine = brinicle.ItemSearchEngine(
    "item_index",
    dim=96,
    lexical_config=cfg,
)
Autocomplete also supports its own scoring configuration.
cfg = brinicle.AutocompleteConfig()
cfg.search_position_decay = 0.5
cfg.search_length_penalty = 0.2
ac = brinicle.AutocompleteEngine(
    "autocomplete_index",
    dim=48,
    autocomplete_config=cfg,
)
Index files
For an index path such as:
engine = brinicle.VectorEngine("my_index", dim=128)
brinicle stores index files beside that base path:
my_index.main
my_index.delta
my_index.lock
High-level engines such as ItemSearchEngine and AutocompleteEngine may also store metadata such as tokenizer and encoding information beside the index.
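To see how much disk space an index takes, the files beside the base path can be listed with the standard library. The sketch below assumes every related file is named after the base path followed by a dot and a suffix, as in the three files listed above; metadata files written by the high-level engines may follow a different naming scheme. The helper name is illustrative, and the demo uses placeholder files rather than a real index.

```python
import tempfile
from pathlib import Path


def index_files(base_path):
    """Map each existing file named '<base>.<suffix>' to its size in bytes."""
    base = Path(base_path)
    prefix = base.name + "."
    return {
        p.name: p.stat().st_size
        for p in sorted(base.parent.iterdir())
        if p.is_file() and p.name.startswith(prefix)
    }


# Demo against placeholder files; a real index would be written by brinicle.
with tempfile.TemporaryDirectory() as tmp:
    for suffix, payload in ((".main", b"x" * 10), (".delta", b"y" * 4), (".lock", b"")):
        (Path(tmp) / f"my_index{suffix}").write_bytes(payload)
    sizes = index_files(Path(tmp) / "my_index")

print(sizes, sum(sizes.values()))
```

Summing the values gives the total on-disk footprint for one index under the stated naming assumption.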
Which engine should I use?
Use VectorEngine if you already have embeddings or numeric vectors.
Use ItemSearchEngine if you have structured catalog-like data such as products, movies, books, jobs, listings, or records with titles and attributes.
Use AutocompleteEngine if you want low-RAM query or title suggestions.
Limitations
- brinicle is not a full-text search engine.
- Item Search is designed for structured catalog-like records, not long documents.
- Item Search is symbolic/lexical, not neural semantic search.
- Autocomplete is experimental.
- Search quality depends on normalization, tokenizer behavior, and field structure.
- Large updates may require graph optimization or compact rebuild.
Roadmap
- High-level item search API
- High-level autocomplete API
- Metadata persistence for tokenizer and encoding config
- More benchmarks for item search and autocomplete
- Better prefix-aware autocomplete encoding
- Improved documentation and examples
License
brinicle is licensed under the Apache License, Version 2.0.
See the LICENSE file.