An embeddable, in-process search engine written in Rust
lucisearch
The SQLite of Search — an embeddable, in-process search engine.
No cluster to manage. No HTTP layer. No JVM. pip install and search.
pip install lucisearch
Quick Start
import luci

# Create an index with field mappings
index = luci.Index.create("products.luci", {
    "properties": {
        "title": {"type": "text"},
        "description": {"type": "text"},
        "category": {"type": "keyword"},
        "price": {"type": "float"},
        "in_stock": {"type": "boolean"},
    }
})

# Index documents
index.bulk([
    {"title": "Wireless Headphones", "description": "Noise-cancelling bluetooth headphones", "category": "electronics", "price": 79.99, "in_stock": True},
    {"title": "Running Shoes", "description": "Lightweight trail running shoes", "category": "sports", "price": 129.99, "in_stock": True},
    {"title": "Coffee Maker", "description": "Programmable drip coffee maker", "category": "kitchen", "price": 49.99, "in_stock": False},
])

# Search
results = index.search({"match": {"title": "headphones"}}, 10)
for hit in results["hits"]:
    print(f'{hit["_score"]:.2f} {hit["_source"]["title"]}')
Queries
Luci supports the Elasticsearch query DSL. Pass any query as a Python dict.
Full-text search
# Single field
index.search({"match": {"title": "running shoes"}}, 10)
# Multiple fields
index.search({"multi_match": {"query": "wireless", "fields": ["title", "description"]}}, 10)
# Exact phrase
index.search({"match_phrase": {"description": "trail running"}}, 10)
Filtering and boolean logic
# Term query (exact match on keyword fields)
index.search({"term": {"category": "electronics"}}, 10)
# Bool query — combine must, should, must_not, filter
index.search({
    "bool": {
        "must": [{"match": {"title": "shoes"}}],
        "filter": [
            {"term": {"in_stock": True}},
            {"range": {"price": {"lte": 100}}},
        ]
    }
}, 10)
# Prefix and fuzzy (wildcard and regexp queries are also supported)
index.search({"prefix": {"category": "elec"}}, 10)
index.search({"fuzzy": {"title": {"value": "headphoens", "fuzziness": 1}}}, 10)
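The fuzzy example above matches "headphoens" against "headphones" with a fuzziness of 1. That only works if swapping two adjacent characters counts as a single edit, which is how Elasticsearch's fuzzy query behaves by default; whether Luci computes distance the same way is an assumption here. The sketch below is a standalone illustration of that distance (optimal string alignment), not Luci's implementation, and `osa_distance` is a hypothetical helper name.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance.

    Insertions, deletions, substitutions, and swaps of two adjacent
    characters each cost one edit.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "en" -> "ne"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("headphoens", "headphones"))  # 1
```

Without the transposition rule the same pair would be two edits apart (two substitutions), so a fuzziness of 1 would not match.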
Sorting and pagination
# Sort by field
results = index.search({
    "query": {"match_all": {}},
    "sort": [{"price": "asc"}],
    "size": 10
})

# Pagination with from/size
results = index.search({
    "query": {"match_all": {}},
    "sort": ["price"],
    "from": 20,
    "size": 10
})

# Cursor-based pagination with search_after
results = index.search({
    "query": {"match_all": {}},
    "sort": ["price"],
    "size": 10,
    "search_after": [49.99]
})
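The `search_after` request above can be turned into a loop that walks an entire result set page by page. This is a minimal sketch, assuming the cursor can be taken from the sort field stored in the last hit's `_source`; `iter_pages` and `fake_search` are hypothetical names, and `fake_search` is an in-memory stand-in so the example runs without an index.

```python
from typing import Any, Callable, Dict, Iterator, List

Body = Dict[str, Any]


def iter_pages(search: Callable[[Body], Dict[str, Any]],
               sort_field: str, page_size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield pages of hits, advancing the cursor with search_after."""
    body: Body = {"query": {"match_all": {}}, "sort": [sort_field], "size": page_size}
    while True:
        hits = search(body)["hits"]
        if not hits:
            return
        yield hits
        # Assumption: the sort value is read back from the last hit's _source.
        cursor = hits[-1]["_source"][sort_field]
        body = {**body, "search_after": [cursor]}


# In-memory stand-in for index.search, for illustration only.
PRICES = [79.99, 129.99, 49.99]


def fake_search(body: Body) -> Dict[str, Any]:
    prices = sorted(PRICES)
    after = body.get("search_after")
    if after is not None:
        prices = [p for p in prices if p > after[0]]
    return {"hits": [{"_source": {"price": p}} for p in prices[:body["size"]]]}


pages = [[hit["_source"]["price"] for hit in page]
         for page in iter_pages(fake_search, "price", 2)]
print(pages)  # [[49.99, 79.99], [129.99]]
```

With a real index, `iter_pages(index.search, "price", 10)` would be the equivalent call. Sorting on a single non-unique field can skip documents that share the cursor value, so a unique tiebreaker field in `sort` is worth adding when prices repeat.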
Aggregations
# Terms aggregation
results = index.search({
    "query": {"match_all": {}},
    "aggs": {"categories": {"terms": {"field": "category"}}},
    "size": 0
})
for bucket in results["aggregations"]["categories"]["buckets"]:
    print(f'{bucket["key"]}: {bucket["doc_count"]}')

# Metric aggregations
results = index.search({
    "query": {"match_all": {}},
    "aggs": {
        "avg_price": {"avg": {"field": "price"}},
        "price_stats": {"stats": {"field": "price"}},
        "price_ranges": {"range": {
            "field": "price",
            "ranges": [{"to": 50}, {"from": 50, "to": 100}, {"from": 100}]
        }},
    },
    "size": 0
})

# Nested aggregations
results = index.search({
    "query": {"match_all": {}},
    "aggs": {"by_category": {
        "terms": {"field": "category"},
        "aggs": {"avg_price": {"avg": {"field": "price"}}},
    }},
    "size": 0
})
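To make the nested aggregation concrete, here is what a `terms` bucket with an `avg` sub-aggregation computes, written as plain Python over a list of dicts. It is a sketch of the semantics only; the flat `avg` key and the ordering by descending count are illustrative choices, and the exact shape of Luci's response for sub-aggregations is not shown here.

```python
from collections import defaultdict
from typing import Any, Dict, List


def avg_by_term(docs: List[Dict[str, Any]], term_field: str,
                value_field: str) -> List[Dict[str, Any]]:
    """Group docs by term_field and average value_field within each group."""
    counts: Dict[Any, int] = defaultdict(int)
    values: Dict[Any, List[float]] = defaultdict(list)
    for doc in docs:
        if term_field not in doc:
            continue  # docs without the term fall into no bucket
        key = doc[term_field]
        counts[key] += 1
        if value_field in doc:
            values[key].append(doc[value_field])
    buckets = [
        {
            "key": key,
            "doc_count": counts[key],
            "avg": sum(values[key]) / len(values[key]) if values[key] else None,
        }
        for key in counts
    ]
    # Largest buckets first, ties broken by key
    buckets.sort(key=lambda b: (-b["doc_count"], b["key"]))
    return buckets


docs = [
    {"category": "electronics", "price": 80.0},
    {"category": "electronics", "price": 20.0},
    {"category": "sports", "price": 130.0},
]
for bucket in avg_by_term(docs, "category", "price"):
    print(bucket["key"], bucket["doc_count"], bucket["avg"])
```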
Highlighting
results = index.search({
    "query": {"match": {"description": "coffee"}},
    "highlight": {
        "fields": {"description": {}},
        "pre_tags": ["<b>"],
        "post_tags": ["</b>"],
    }
})
for hit in results["hits"]:
    print(hit.get("highlight", {}))
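The effect of `pre_tags` and `post_tags` can be illustrated with a few lines of standalone Python. This is a simplified sketch that wraps whole-word, case-insensitive matches of the query terms; a real highlighter works on analyzed tokens and selects fragments, and `highlight_text` is a hypothetical helper, not part of Luci.

```python
import re


def highlight_text(text: str, query: str,
                   pre_tag: str = "<b>", post_tag: str = "</b>") -> str:
    """Wrap each whole-word occurrence of a query term in the given tags."""
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return text
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{pre_tag}{m.group(0)}{post_tag}", text)


print(highlight_text("Programmable drip coffee maker", "coffee"))
# Programmable drip <b>coffee</b> maker
```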
Vector search (kNN)
# Create index with vector field
index = luci.Index.create("vectors.luci", {
    "properties": {
        "title": {"type": "text"},
        "embedding": {"type": "dense_vector", "dims": 384},
    }
})

# kNN search
results = index.search({
    "knn": {
        "field": "embedding",
        "query_vector": [0.1, 0.2, ...],  # 384-dim vector
        "k": 10,
        "num_candidates": 50,
    }
}, 10)

# Hybrid search: text + vector combined via RRF
results = index.search({
    "query": {"match": {"title": "headphones"}},
    "knn": {
        "field": "embedding",
        "query_vector": query_vector,
        "k": 10,
        "num_candidates": 50,
    }
}, 10)
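Reciprocal Rank Fusion merges the text ranking and the vector ranking using only each document's position in the two lists, so the BM25 and similarity scores never need to be on the same scale. The sketch below shows the standard formula, score(d) = sum over rankings of 1 / (k + rank). The constant `k=60` is a commonly used default and an assumption here, not a documented Luci setting, and `rrf` is a hypothetical helper name.

```python
from typing import Dict, List, Sequence, Tuple


def rrf(rankings: Sequence[Sequence[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


text_ranking = ["a", "b", "c"]    # e.g. IDs ordered by BM25 score
vector_ranking = ["c", "a", "d"]  # e.g. IDs ordered by vector similarity
fused = rrf([text_ranking, vector_ranking])
print([doc_id for doc_id, _ in fused])  # ['a', 'c', 'b', 'd']
```

Documents that appear near the top of both lists ("a" and "c") outrank documents that appear in only one.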
Geospatial queries
# Create index with geo fields
index = luci.Index.create("places.luci", {
    "properties": {
        "name": {"type": "text"},
        "location": {"type": "geo_point"},
    }
})

# Geo distance
index.search({
    "geo_distance": {
        "distance": "10km",
        "location": {"lat": 40.7128, "lon": -74.0060}
    }
}, 10)

# Geo bounding box
index.search({
    "geo_bounding_box": {
        "location": {
            "top_left": {"lat": 41.0, "lon": -74.5},
            "bottom_right": {"lat": 40.5, "lon": -73.5}
        }
    }
}, 10)
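The two geo queries above correspond to simple geometric tests. The sketch below shows one common way to express them in plain Python: great-circle distance via the haversine formula, and a latitude/longitude containment check. It is an illustration under stated assumptions (a spherical Earth with mean radius 6371.0088 km, boxes that do not cross the antimeridian), not a description of how Luci evaluates these queries; both function names are hypothetical.

```python
import math
from typing import Dict

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, spherical approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def in_bounding_box(lat: float, lon: float,
                    top_left: Dict[str, float],
                    bottom_right: Dict[str, float]) -> bool:
    """True if the point lies inside the box (no antimeridian crossing)."""
    return (bottom_right["lat"] <= lat <= top_left["lat"]
            and top_left["lon"] <= lon <= bottom_right["lon"])


nyc = (40.7128, -74.0060)
box_top_left = {"lat": 41.0, "lon": -74.5}
box_bottom_right = {"lat": 40.5, "lon": -73.5}
print(in_bounding_box(*nyc, box_top_left, box_bottom_right))  # True
print(haversine_km(*nyc, *nyc) <= 10.0)  # True: a point is within 10km of itself
```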
Document CRUD
# Add with explicit ID
index.add({"_id": "prod-1", "title": "Widget", "price": 9.99})
# Get by ID
doc = index.get("prod-1")
# Update (partial merge)
index.update("prod-1", {"price": 7.99})
# Delete by ID
index.delete("prod-1")
# Delete by query
index.delete_by_query({"term": {"category": "discontinued"}})
# Count
count = index.count({"term": {"in_stock": True}})
Field Types
| Type | Description |
|---|---|
| `text` | Full-text search with BM25 scoring and analysis |
| `keyword` | Exact match, sorting, aggregations |
| `integer`, `long` | Signed integers |
| `float`, `double` | Floating point numbers |
| `boolean` | true / false |
| `date` | Date/time values |
| `dense_vector` | Fixed-dimension float vectors (cosine, L2, dot product; int8 quantization) |
| `geo_point` | Latitude/longitude pairs |
| `geo_shape` | Polygons, multipolygons with spatial relations |
| `nested` | Arrays of objects with independent field scoping |
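Since `text` fields are scored with BM25, it may help to see how a single term contributes to a document's score. The sketch below uses the textbook BM25 formula with the smoothed IDF found in Lucene-style engines; the parameters `k1=1.2` and `b=0.75` are widely used defaults and an assumption here, as is the exact variant Luci implements. `bm25_term_score` is a hypothetical helper name.

```python
import math


def bm25_term_score(tf: int, doc_len: int, avg_doc_len: float,
                    num_docs: int, doc_freq: int,
                    k1: float = 1.2, b: float = 0.75) -> float:
    """BM25 contribution of one query term to one document.

    tf        -- occurrences of the term in the document
    doc_len   -- number of tokens in the document's field
    doc_freq  -- number of documents containing the term
    """
    if tf == 0:
        return 0.0
    # Rare terms get a higher weight; the +1 keeps the IDF positive
    idf = math.log(1 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalisation: long documents need more occurrences to score the same
    norm = tf + k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / norm


# "headphones" appears once in a 2-token title, in 1 of 3 documents
print(round(bm25_term_score(tf=1, doc_len=2, avg_doc_len=2.0,
                            num_docs=3, doc_freq=1), 4))
```

The score for a multi-term `match` query is the sum of these per-term contributions over the terms that occur in the document.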
Features
- Full-text search with BM25 scoring, analyzers, phrase queries, fuzzy matching
- Vector search with HNSW, int8 quantization, pre-filtering
- Hybrid search with Reciprocal Rank Fusion (RRF)
- 20+ aggregation types: terms, avg, sum, min, max, stats, range, histogram, cardinality, percentiles, date_histogram, geo_bounds, filters, nested, and more
- Geospatial: geo_distance, geo_bounding_box, geo_shape with all spatial relations
- Nested documents with block-join queries and inner_hits
- Highlighting with custom tags, per-field configuration
- Sort by field: keyword, numeric, score, with multi-level sort
- Pagination: `from`/`size` and cursor-based `search_after`
- Collapse: deduplicate results by a keyword field
- Explain: BM25 score breakdowns
- Rescore: two-phase scoring with custom query weights
- Single-file storage: one `.luci` file, no directory sprawl
- Auto-commit: documents are searchable immediately after `add()` or `bulk()`
- ES-compatible JSON query DSL: same queries, same field types
License
MIT