django-icv-search
Pluggable search engine integration for Django — index management, document indexing, and search queries with swappable backends.
Part of the ICV-Django ecosystem.
Features
- Backend abstraction — swappable search backends modelled after Django's email backend pattern; swap Meilisearch for any engine by pointing a setting at your own `BaseSearchBackend` subclass
- Meilisearch backend — default implementation using `httpx` directly, keeping dependencies minimal and leaving the door open for async support
- PostgreSQL backend — zero-infrastructure search using built-in full-text search
- Django-native filter/sort — use dicts and lists instead of engine-specific syntax
- Index management — create, configure, sync, and delete search indexes; Django is the source of truth, the engine is the secondary store
- Document indexing — add, update, delete, and bulk-index documents with full audit logging via `IndexSyncLog`
- SearchableMixin — declare Django models as indexable with a small set of class attributes; override two methods for custom serialisation and queryset filtering
- Auto-indexing — `ICV_SEARCH_AUTO_INDEX` wires `post_save`/`post_delete` signal handlers automatically; disable per-block with `skip_index_update()`
- Multi-tenancy — optional tenant-prefixed index names via a configurable callable; no hard coupling to any specific tenant model
- Management commands — sync, reindex, health check, create, and clear indexes from the command line
- Celery integration — async document indexing and periodic sync tasks; degrades gracefully to synchronous when Celery is not installed
- DummyBackend — in-memory backend for testing without a running search engine; ships with pytest fixtures and helper functions in `icv_search.testing`
- Normalised response types — `TaskResult`, `SearchResult`, and `IndexStats` dataclasses insulate consuming code from engine-specific response shapes
- ICVSearchPaginator — Django `Paginator` subclass that uses `estimated_total_hits` from the search engine; no `queryset.count()` query; `is_estimated` flag for approximate display
- Range filters — `__gte`, `__gt`, `__lte`, `__lt` suffixes on filter dict keys for numeric range queries across all backends
- Facet distribution — `SearchResult.facet_distribution` normalised from the engine response, with `get_facet_values()` helper
- Health check endpoint — `/health/` JSON view for load balancer probes; include via `icv_search.urls`
- Zero-downtime reindex — `reindex_zero_downtime()` creates a temp index, populates it, then atomically swaps with the live index
- Signal debouncing — `ICV_SEARCH_DEBOUNCE_SECONDS` batches rapid auto-index signals into a single indexing call
- Highlighting — `SearchResult.formatted_hits` with `get_highlighted_hits()` helper; supports `highlight_fields`, custom pre/post tags across all three backends
- Ranking scores — `SearchResult.ranking_scores` with `get_hit_with_score()` helper; Meilisearch `_rankingScore`, PostgreSQL `ts_rank`, DummyBackend term-frequency
- Geo-distance search — `geo_point`, `geo_radius`, `geo_sort` params using Meilisearch native `_geoRadius`/`_geoPoint` or Haversine on other backends; `_geoDistance` on hits
- Soft-delete awareness — `SearchableMixin` auto-excludes soft-deleted records (`is_deleted`, `deleted_at`); auto-index removes soft-deleted instances on save
- Multi-search — `multi_search()` executes multiple queries in one request (native `POST /multi-search` on Meilisearch)
- Synonym/stop-word/typo management — dedicated service functions for `get_synonyms()`, `update_synonyms()`, `get_stop_words()`, `get_typo_tolerance()`, and more
- SearchQuery DSL — fluent chainable query builder: `.text().filter().sort().facets().highlight().geo_near().execute()`
- Search analytics — `SearchQueryLog` for individual queries and `SearchQueryAggregate` for daily rollups; `ICV_SEARCH_LOG_MODE` controls strategy (`"individual"`, `"aggregate"`, `"both"`); sample rate control for high-traffic sites; `get_popular_queries()`, `get_zero_result_queries()`, `get_search_stats()`, `get_query_trend()` service functions
- Tenant middleware — `ICVSearchTenantMiddleware` auto-injects tenant context from request; `get_current_tenant_id()` for automatic scoping
- Search result cache — optional `ICVSearchCache` layer using Django's cache framework with automatic invalidation on index changes
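The health check endpoint mentioned above can be wired in with a standard URL include. A minimal sketch — the `search/` prefix is an arbitrary choice, not mandated by the package:

```python
# myproject/urls.py
from django.urls import include, path

urlpatterns = [
    # ...
    # Exposes the /health/ JSON view under /search/health/
    path("search/", include("icv_search.urls")),
]
```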
Installation
Basic (standalone)
pip install django-icv-search
Add to INSTALLED_APPS:
INSTALLED_APPS = [
# ...
"icv_search",
]
Run migrations:
python manage.py migrate icv_search
With icv-core
Installing with the icv-core extra gives you BaseModel (UUID primary key
plus created_at / updated_at timestamps) from
icv-core:
pip install "django-icv-search[icv-core]"
INSTALLED_APPS = [
# ...
"icv_core",
"icv_search",
]
Both SearchIndex and IndexSyncLog inherit from icv_core.models.BaseModel
automatically when icv_core is present.
Quick Start
The following example creates a search index, indexes a handful of documents,
and runs a search against Meilisearch running on localhost:7700.
# settings.py
ICV_SEARCH_BACKEND = "icv_search.backends.meilisearch.MeilisearchBackend"
ICV_SEARCH_URL = "http://localhost:7700"
ICV_SEARCH_API_KEY = "your-meilisearch-master-key"
# 1. Make your model searchable
# myapp/models.py
from django.db import models
from icv_search.mixins import SearchableMixin
class Article(SearchableMixin, models.Model):
search_index_name = "articles"
search_fields = ["title", "body"]
search_filterable_fields = ["published", "author_id"]
search_sortable_fields = ["published_at", "title"]
title = models.CharField(max_length=200)
body = models.TextField()
author_id = models.IntegerField()
published = models.BooleanField(default=False)
published_at = models.DateTimeField(null=True)
# 2. Create the index (run once, e.g. in a migration or management command)
from icv_search.services import create_index
index = create_index("articles", model_class=Article)
# 3. Index documents
from icv_search.services import index_documents
index_documents("articles", [
{"id": "1", "title": "Django tips", "body": "...", "published": True},
{"id": "2", "title": "Search patterns", "body": "...", "published": True},
])
# 4. Search
from icv_search.services import search
results = search("articles", "django", limit=10)
for hit in results.hits:
print(hit["title"])
Configuration
Settings Reference
All settings are namespaced under ICV_SEARCH_*. Every setting has a
sensible default so the package works out of the box for local development.
| Setting | Type | Default | Description |
|---|---|---|---|
| `ICV_SEARCH_BACKEND` | str | `"icv_search.backends.meilisearch.MeilisearchBackend"` | Dotted path to the active search backend class |
| `ICV_SEARCH_URL` | str | `"http://localhost:7700"` | Search engine base URL |
| `ICV_SEARCH_API_KEY` | str | `""` | Master or admin API key for the search engine |
| `ICV_SEARCH_TIMEOUT` | int | `30` | Request timeout in seconds for all backend calls |
| `ICV_SEARCH_TENANT_PREFIX_FUNC` | str | `""` | Dotted path to a callable `(request_or_none) -> str` that returns the tenant prefix. Empty string disables multi-tenancy |
| `ICV_SEARCH_AUTO_SYNC` | bool | `True` | Automatically push index settings to the engine when a `SearchIndex` record is saved |
| `ICV_SEARCH_ASYNC_INDEXING` | bool | `True` | Use Celery for document indexing operations. Falls back to synchronous when Celery is unavailable |
| `ICV_SEARCH_INDEX_PREFIX` | str | `""` | Global prefix applied to all engine index names (e.g. `"staging_"` to segregate environments) |
| `ICV_SEARCH_AUTO_INDEX` | dict | `{}` | Automatic model-level indexing configuration. See below |
| `ICV_SEARCH_DEBOUNCE_SECONDS` | int | `0` | Debounce window in seconds for auto-index signal batching. When > 0, rapid saves are collected and indexed in a single batch after this delay. Requires Django's cache framework. `0` disables debouncing |
| `ICV_SEARCH_LOG_QUERIES` | bool | `False` | Log every `search()` call to `SearchQueryLog` for analytics |
| `ICV_SEARCH_LOG_ZERO_RESULTS_ONLY` | bool | `False` | When `True` (and `LOG_QUERIES` is `True`), only zero-result queries are logged |
| `ICV_SEARCH_LOG_MODE` | str | `"individual"` | Logging strategy: `"individual"` writes per-query rows to `SearchQueryLog`, `"aggregate"` writes daily rollups to `SearchQueryAggregate`, `"both"` writes to both. See Search Analytics |
| `ICV_SEARCH_LOG_SAMPLE_RATE` | float | `1.0` | Fraction of individual `SearchQueryLog` rows to write (0.0–1.0). Only applies to `"individual"` and `"both"` modes. Aggregate counts are always recorded at 100% regardless of this setting |
| `ICV_SEARCH_CACHE_ENABLED` | bool | `False` | Enable search result caching via Django's cache framework |
| `ICV_SEARCH_CACHE_TIMEOUT` | int | `60` | Cache TTL in seconds for stored search results |
| `ICV_SEARCH_CACHE_ALIAS` | str | `"default"` | Django cache alias used by the search result cache |
Auto-Indexing Configuration
ICV_SEARCH_AUTO_INDEX wires post_save and post_delete signal handlers
automatically for any model you declare. The package's AppConfig.ready() reads
this setting and connects the handlers on startup.
Each key in the dict is the logical index name. The value is a configuration dict with the following keys:
| Key | Type | Default | Description |
|---|---|---|---|
| `model` | str | required | `"app_label.ModelName"` — the Django model to watch |
| `on_save` | bool | `True` | Index the document when the model instance is saved |
| `on_delete` | bool | `True` | Remove the document when the model instance is deleted |
| `async` | bool | from `ICV_SEARCH_ASYNC_INDEXING` | Override async behaviour for this index only |
| `auto_create` | bool | `True` | Create the `SearchIndex` record and engine index if they do not yet exist |
| `should_update` | str | `""` | Dotted path to a callable `(instance) -> bool`. When provided, the document is only indexed when the callable returns `True` |
| `updated_field` | str | `""` | Field name used to filter records for incremental reindexing (reserved for future use) |
Example — multiple models:
ICV_SEARCH_AUTO_INDEX = {
"articles": {
"model": "blog.Article",
"on_save": True,
"on_delete": True,
"async": True,
"auto_create": True,
"should_update": "blog.search.should_index_article",
},
"products": {
"model": "catalogue.Product",
"on_save": True,
"on_delete": True,
"async": False, # Synchronous for this index
},
}
# blog/search.py
def should_index_article(instance) -> bool:
"""Only index published articles."""
return instance.published
SearchableMixin
Add SearchableMixin to any Django model to make it indexable. Declare the
index configuration as class attributes.
from django.db import models
from icv_search.mixins import SearchableMixin
class Product(SearchableMixin, models.Model):
# Required: the logical name of the search index
search_index_name = "products"
# Fields included in full-text search
search_fields = ["name", "description", "sku"]
# Fields that can be used in filter expressions
search_filterable_fields = ["category_id", "is_active", "price"]
# Fields that can be used in sort expressions
search_sortable_fields = ["price", "created_at", "name"]
name = models.CharField(max_length=200)
description = models.TextField()
sku = models.CharField(max_length=50, unique=True)
category_id = models.IntegerField()
price = models.DecimalField(max_digits=10, decimal_places=2)
is_active = models.BooleanField(default=True)
created_at = models.DateTimeField(auto_now_add=True)
Customising the document representation
Override to_search_document() to control exactly what is sent to the engine.
The default implementation includes id and all fields listed in
search_fields, converting dates to ISO strings and other non-primitive types
to strings.
def to_search_document(self) -> dict:
return {
"id": str(self.id),
"name": self.name,
"description": self.description,
"sku": self.sku,
"price": float(self.price), # Decimal -> float for JSON
"category_id": self.category_id,
"is_active": self.is_active,
"category_name": self.category.name, # Denormalised for search
}
Customising the reindex queryset
Override get_search_queryset() to control which records are included in a
full reindex and to add select_related / prefetch_related for performance.
@classmethod
def get_search_queryset(cls):
return (
cls.objects
.filter(is_active=True)
.select_related("category")
)
Service API
Import service functions from icv_search.services:
from icv_search.services import (
# Index management
create_index, delete_index, update_index_settings,
get_index_settings, get_index_stats,
# Synonym / stop-word / typo management
get_synonyms, update_synonyms, reset_synonyms,
get_stop_words, update_stop_words, reset_stop_words,
get_typo_tolerance, update_typo_tolerance,
# Document operations
index_documents, remove_documents,
index_model_instances, reindex_all, reindex_zero_downtime,
# Search
search, multi_search, get_task,
# Analytics
get_popular_queries, get_zero_result_queries,
get_search_stats, get_query_trend,
clear_query_logs, clear_query_aggregates,
# Utilities
get_current_tenant_id, ICVSearchCache,
)
Index Management
create_index
def create_index(
name: str,
tenant_id: str = "",
settings: dict | None = None,
primary_key: str = "id",
model_class: type | None = None,
) -> SearchIndex:
Creates a SearchIndex record, provisions the index in the engine, and pushes
any settings. If model_class is provided and it uses SearchableMixin, its
field lists seed the index settings automatically.
from icv_search.services import create_index
from myapp.models import Product
index = create_index(
name="products",
model_class=Product, # Reads search_filterable_fields etc.
settings={"rankingRules": ["words", "typo", "proximity"]}, # Overrides model
)
delete_index
def delete_index(name_or_index: str | SearchIndex, tenant_id: str = "") -> None:
Deletes the SearchIndex record from Django and removes the index from the
engine. Raises SearchBackendError on engine failure.
from icv_search.services import delete_index
delete_index("products")
update_index_settings
def update_index_settings(
name_or_index: str | SearchIndex,
settings: dict,
tenant_id: str = "",
) -> SearchIndex:
Merges settings into the existing index settings, saves to Django, and syncs
to the engine.
from icv_search.services import update_index_settings
index = update_index_settings("products", {
"synonyms": {"phone": ["mobile", "handset"]},
})
get_index_stats
def get_index_stats(name_or_index: str | SearchIndex, tenant_id: str = "") -> IndexStats:
Returns a normalised IndexStats dataclass with live data from the engine.
from icv_search.services import get_index_stats
stats = get_index_stats("products")
print(stats.document_count)
print(stats.is_indexing)
Document Operations
index_documents
def index_documents(
name_or_index: str | SearchIndex,
documents: list[dict],
tenant_id: str = "",
primary_key: str = "id",
) -> TaskResult:
Adds or updates documents in the search index. Returns a TaskResult.
from icv_search.services import index_documents
result = index_documents("products", [
{"id": "abc123", "name": "Widget", "price": 9.99},
{"id": "def456", "name": "Gadget", "price": 24.99},
])
print(result.task_uid)
remove_documents
def remove_documents(
name_or_index: str | SearchIndex,
document_ids: list[str],
tenant_id: str = "",
) -> TaskResult:
Removes documents from the index by their primary key values.
from icv_search.services import remove_documents
remove_documents("products", ["abc123", "def456"])
index_model_instances
def index_model_instances(
model_class: type,
queryset=None,
batch_size: int = 1000,
) -> int:
Indexes model instances using their SearchableMixin configuration. Iterates
the queryset in batches to avoid loading the entire dataset into memory. Returns
the total number of documents indexed.
from icv_search.services import index_model_instances
from myapp.models import Product
count = index_model_instances(Product, batch_size=500)
print(f"Indexed {count} products")
reindex_all
def reindex_all(
name_or_index: str | SearchIndex,
model_class: type,
tenant_id: str = "",
batch_size: int = 1000,
) -> int:
Full reindex: clears all existing documents, then re-indexes from the model's
get_search_queryset(). Use index_model_instances instead if you do not want
to clear first.
from icv_search.services import reindex_all
from myapp.models import Product
total = reindex_all("products", Product, batch_size=500)
print(f"Reindexed {total} products")
reindex_zero_downtime
def reindex_zero_downtime(
name_or_index: str | SearchIndex,
model_class: type,
tenant_id: str = "",
batch_size: int = 1000,
) -> int:
Zero-downtime reindex: creates a temporary index with the same settings,
populates it from the model queryset, then atomically swaps it with the live
index. The old index is deleted after the swap. Falls back to reindex_all()
if the backend does not support index swaps.
from icv_search.services import reindex_zero_downtime
from myapp.models import Product
total = reindex_zero_downtime("products", Product, batch_size=500)
Search
Filters and sort orders use Django-native syntax — the service layer translates them to each engine's native format automatically. This means the same calling code works across all backends.
from icv_search.services import search
# Django-native filter dict
result = search("products", "padel", filter={"category": "equipment", "is_active": True})
# Django-native sort list (- prefix = descending)
result = search("products", "", sort=["-price", "name"])
# Combined
result = search("products", "padel",
filter={"city": "Madrid", "is_active": True},
sort=["-created_at"],
limit=10,
)
search
def search(
name_or_index: str | SearchIndex,
query: str,
tenant_id: str = "",
**params,
) -> SearchResult:
Executes a search query and returns a normalised SearchResult. Additional
keyword arguments are passed to the engine (e.g. filter, sort, limit,
offset, facets).
from icv_search.services import search
# Basic search
results = search("products", "widget")
# With Django-native filter dict and sort list
results = search(
"products",
"widget",
filter={"is_active": True, "price__lt": 50},
sort=["-price"],
limit=20,
offset=0,
)
for hit in results.hits:
print(hit["name"], hit["price"])
print(f"About {results.estimated_total_hits} results")
Pagination
ICVSearchPaginator wraps a SearchResult for use with Django's pagination
machinery (ListView, templates). It uses estimated_total_hits as the count
instead of running a separate queryset.count() query.
from icv_search import ICVSearchPaginator
from icv_search.services import search
# In a view
page_number = int(request.GET.get("page", 1))
per_page = 25
result = search("products", query, limit=per_page, offset=(page_number - 1) * per_page)
paginator = ICVSearchPaginator(result, per_page=per_page)
page_obj = paginator.get_page(page_number)
# In a template
{% for hit in page_obj %}
{{ hit.name }}
{% endfor %}
{% if page_obj.is_estimated %}
{{ page_obj.display_count }} results
{% else %}
{{ page_obj.paginator.count }} results
{% endif %}
Facets
When requesting facets from the search engine, the normalised facet_distribution
is available directly on SearchResult:
result = search("products", "shoes", facets=["brand", "colour"])
print(result.facet_distribution)
# {"brand": {"Nike": 42, "Adidas": 31}, "colour": {"black": 55, "white": 28}}
# Convenience helper — sorted by count descending
for facet in result.get_facet_values("brand"):
print(f"{facet['name']}: {facet['count']}")
Range Filters
Use Django-style lookup suffixes for numeric range queries:
result = search("products", "",
filter={"price__gte": 10, "price__lte": 100, "is_active": True},
)
Supported suffixes: __gte (>=), __gt (>), __lte (<=), __lt (<).
Works across all backends.
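To make the translation concrete, here is a rough illustration of how a Django-style filter dict could map onto a Meilisearch-style filter expression. This is a sketch of the idea, not the package's actual translation code:

```python
def to_meili_filter(filters: dict) -> str:
    """Translate a Django-style filter dict into a Meilisearch-style filter string (sketch)."""
    ops = {"gte": ">=", "gt": ">", "lte": "<=", "lt": "<"}
    clauses = []
    for key, value in filters.items():
        field, _, suffix = key.rpartition("__")
        if field and suffix in ops:
            # Range lookup: price__gte -> "price >= 10"
            clauses.append(f"{field} {ops[suffix]} {value}")
        else:
            # Equality: booleans lowercased, strings quoted
            if isinstance(value, bool):
                rendered = str(value).lower()
            elif isinstance(value, str):
                rendered = f"'{value}'"
            else:
                rendered = str(value)
            clauses.append(f"{key} = {rendered}")
    return " AND ".join(clauses)

print(to_meili_filter({"price__gte": 10, "price__lte": 100, "is_active": True}))
# price >= 10 AND price <= 100 AND is_active = true
```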
Bulk Operations
skip_index_update
A context manager that temporarily disables auto-indexing signal handlers.
Use this in bulk imports, data migrations, and test factories to avoid
triggering individual index updates for every save() call.
from icv_search.auto_index import skip_index_update
from myapp.models import Article
articles = [Article(title=f"Article {i}") for i in range(1000)]
with skip_index_update():
Article.objects.bulk_create(articles)
# No search index updates during this block
# Trigger a single reindex after the bulk operation
from icv_search.services import reindex_all
reindex_all("articles", Article)
The context manager is nestable. Auto-indexing resumes when the outermost
with block exits.
Highlighting
Pass highlight_fields to get highlighted versions of matching text:
result = search("articles", "django tips",
highlight_fields=["title", "body"],
highlight_pre_tag="<mark>", # default
highlight_post_tag="</mark>", # default
)
# Highlighted versions of each hit
for hit in result.get_highlighted_hits():
print(hit["title"]) # "...about <mark>Django</mark> <mark>tips</mark>..."
# Or access directly
result.formatted_hits # list of highlighted hit dicts
Works across all backends: Meilisearch uses native _formatted, PostgreSQL uses
ts_headline(), DummyBackend wraps matching substrings.
Ranking Scores
Request ranking scores to understand result relevance:
result = search("products", "shoes", show_ranking_score=True)
for i, hit in enumerate(result.hits):
hit, score = result.get_hit_with_score(i)
print(f"{hit['name']}: {score:.2f}")
Meilisearch returns _rankingScore (0.0–1.0), PostgreSQL uses ts_rank, and
DummyBackend computes a simple term-frequency score.
Geo-Distance Search
Filter and sort results by geographic distance:
# Find restaurants within 5km of a point
result = search("restaurants", "",
geo_point=(51.5074, -0.1278), # London (lat, lng)
geo_radius=5000, # metres
geo_sort="asc", # nearest first
)
for hit in result.hits:
print(f"{hit['name']}: {hit.get('_geoDistance')}m away")
Models with geo data should declare it on the mixin:
class Restaurant(SearchableMixin, models.Model):
search_index_name = "restaurants"
search_fields = ["name", "cuisine"]
search_lat_field = "latitude"
search_lng_field = "longitude"
latitude = models.FloatField()
longitude = models.FloatField()
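For the non-Meilisearch backends, distances fall back to Haversine. As an illustration of that formula (not the package's exact implementation):

```python
import math

def haversine_m(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in metres between two (lat, lng) points."""
    r = 6_371_000  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlng = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlng / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# London to Paris is roughly 344 km
print(round(haversine_m(51.5074, -0.1278, 48.8566, 2.3522) / 1000))
```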
Multi-Search
Execute multiple queries in a single request:
from icv_search.services import multi_search
results = multi_search([
{"index_name": "products", "query": "shoes", "limit": 5},
{"index_name": "articles", "query": "shoes", "limit": 3, "facets": ["category"]},
])
product_results, article_results = results
Meilisearch uses the native POST /multi-search endpoint. Other backends
execute queries sequentially.
Synonym and Stop-Word Management
from icv_search.services import (
get_synonyms, update_synonyms, reset_synonyms,
get_stop_words, update_stop_words, reset_stop_words,
get_typo_tolerance, update_typo_tolerance,
)
# Synonyms
update_synonyms("products", {"phone": ["mobile", "handset"], "laptop": ["notebook"]})
print(get_synonyms("products"))
reset_synonyms("products")
# Stop words
update_stop_words("products", ["the", "a", "an", "is"])
print(get_stop_words("products"))
# Typo tolerance
update_typo_tolerance("products", {"enabled": True, "minWordSizeForTypos": {"oneTypo": 4}})
SearchQuery Builder
A fluent API for building search queries:
from icv_search import SearchQuery
results = (
SearchQuery("products")
.text("running shoes")
.filter(brand="Nike", price__gte=50)
.sort("-price", "name")
.facets("brand", "category")
.highlight("name", "description")
.geo_near(lat=51.5, lng=-0.12, radius=5000)
.with_ranking_scores()
.limit(20)
.execute()
)
# Or get a paginator directly
paginator = SearchQuery("products").text("shoes").limit(25).paginate()
page = paginator.get_page(1)
Search Analytics
Enable query logging to track search behaviour:
# settings.py
ICV_SEARCH_LOG_QUERIES = True
ICV_SEARCH_LOG_ZERO_RESULTS_ONLY = False # Set True to reduce storage
Logging strategies
ICV_SEARCH_LOG_MODE controls how queries are recorded:
| Mode | Storage | Best for |
|---|---|---|
| `"individual"` (default) | One `SearchQueryLog` row per query | Low/medium traffic sites that need full query history |
| `"aggregate"` | Daily rollups in `SearchQueryAggregate` | High-traffic sites where individual rows would be too large |
| `"both"` | Both individual rows and daily rollups | Sites that want detailed logs for recent queries plus long-term trends |
# settings.py — high-traffic configuration
ICV_SEARCH_LOG_QUERIES = True
ICV_SEARCH_LOG_MODE = "aggregate" # Daily rollups only
ICV_SEARCH_LOG_SAMPLE_RATE = 1.0 # Not applicable in aggregate-only mode
# Or keep individual logs with sampling
ICV_SEARCH_LOG_MODE = "both"
ICV_SEARCH_LOG_SAMPLE_RATE = 0.1 # Write only 10% of individual rows
Aggregate queries are normalised (stripped and lowercased) for consistent grouping. The sample rate only affects SearchQueryLog rows — aggregate counts always reflect 100% of queries.
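The sampling decision itself is simple to picture: something along these lines, shown purely as an illustration (the function name is hypothetical):

```python
import random

def should_log_individual(sample_rate: float) -> bool:
    """Return True for the fraction of queries that get an individual log row."""
    return random.random() < sample_rate

# Edge cases: a rate of 1.0 always logs, 0.0 never does
assert should_log_individual(1.0) is True
assert should_log_individual(0.0) is False
```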
Analytics service functions
from icv_search.services import (
get_popular_queries,
get_zero_result_queries,
get_search_stats,
get_query_trend,
clear_query_logs,
clear_query_aggregates,
)
# Most frequent queries in the last 7 days
popular = get_popular_queries("products", days=7, limit=20)
# Queries that returned no results
gaps = get_zero_result_queries("products", days=7)
# Aggregate stats
stats = get_search_stats("products", days=7)
# {"total_queries": 1234, "avg_processing_time_ms": 12, "zero_result_rate": 0.05}
# Day-by-day trend for a specific query (reads from SearchQueryAggregate)
trend = get_query_trend("running shoes", "products", days=30)
# [{"date": date(2026, 3, 1), "count": 42, "zero_result_count": 3, "avg_processing_time_ms": 8.5}, ...]
# Cleanup
deleted = clear_query_logs(days_older_than=30)
deleted = clear_query_aggregates(days_older_than=90)
All analytics functions (get_popular_queries, get_zero_result_queries, get_search_stats) automatically read from the correct model based on ICV_SEARCH_LOG_MODE.
Tenant Middleware
Auto-inject tenant context from the request instead of passing tenant_id
on every call:
# settings.py
MIDDLEWARE = [
# ...
"icv_search.middleware.ICVSearchTenantMiddleware",
]
ICV_SEARCH_TENANT_PREFIX_FUNC = "myproject.search.get_tenant_prefix"
# In a view — tenant_id is injected automatically
results = search("products", "widget") # No tenant_id needed
# Explicit tenant_id always takes precedence
results = search("products", "widget", tenant_id="other_tenant")
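The callable referenced by `ICV_SEARCH_TENANT_PREFIX_FUNC` receives the request (or `None` for calls outside a request cycle) and returns the prefix string. A minimal sketch, assuming your own tenancy middleware sets `request.tenant` with a `slug` attribute (both are assumptions, not part of this package):

```python
# myproject/search.py — hypothetical tenant prefix callable
def get_tenant_prefix(request_or_none) -> str:
    """Return a tenant prefix such as "acme_", or "" when no tenant applies."""
    if request_or_none is None:
        return ""
    # `request.tenant` is assumed to be set by your own tenancy middleware
    tenant = getattr(request_or_none, "tenant", None)
    return f"{tenant.slug}_" if tenant else ""
```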
Search Result Cache
Enable caching to reduce backend load for repeated queries:
# settings.py
ICV_SEARCH_CACHE_ENABLED = True
ICV_SEARCH_CACHE_TIMEOUT = 60 # seconds
ICV_SEARCH_CACHE_ALIAS = "default" # Django cache alias
Cache is automatically invalidated when documents are indexed or removed.
Queries with a user param bypass the cache (analytics-aware searches may
vary by user).
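One way to picture the cache key is a stable hash over the index name, query, and search params. This is an illustration of the idea, not the package's actual key format:

```python
import hashlib
import json

def search_cache_key(index_name: str, query: str, params: dict) -> str:
    """Build a deterministic cache key for a search call (illustrative)."""
    # sort_keys makes the key independent of param ordering
    payload = json.dumps({"q": query, "params": params}, sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()[:16]
    return f"icv_search:{index_name}:{digest}"

key = search_cache_key("products", "shoes", {"limit": 20, "filter": {"is_active": True}})
```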
Response Types
All service functions return normalised dataclasses, insulating your code from engine-specific response shapes.
TaskResult
Returned by document indexing and deletion operations.
@dataclass
class TaskResult:
task_uid: str # Engine-assigned task identifier
status: str # Task status (e.g. "enqueued", "succeeded")
detail: str # Operation type or description
raw: dict # Original engine response
SearchResult
Returned by search().
@dataclass
class SearchResult:
hits: list[dict] # Matching documents
query: str # The query string as echoed by the engine
processing_time_ms: int # Time taken by the engine (milliseconds)
estimated_total_hits: int # Approximate total matching documents
limit: int # Page size applied
offset: int # Offset applied
facet_distribution: dict[str, dict[str, int]] # Facet counts by field
formatted_hits: list[dict] # Highlighted versions of hits
ranking_scores: list[float | None] # Relevance scores per hit
raw: dict # Original engine response
def get_highlighted_hits(self) -> list[dict]: ...
def get_facet_values(facet_name: str) -> list[dict]: ...
def get_hit_with_score(index: int) -> tuple[dict, float | None]: ...
IndexStats
Returned by get_index_stats().
@dataclass
class IndexStats:
document_count: int # Number of indexed documents
is_indexing: bool # Whether the engine is currently indexing
field_distribution: dict[str, int] # Field name -> document count
raw: dict # Original engine response
Backends
Meilisearch (default)
Requires a running Meilisearch instance (v1.0+). Uses httpx directly rather
than the official SDK, keeping dependencies minimal.
Required settings:
ICV_SEARCH_BACKEND = "icv_search.backends.meilisearch.MeilisearchBackend"
ICV_SEARCH_URL = "http://localhost:7700"
ICV_SEARCH_API_KEY = "your-master-key" # Leave blank if no auth configured
PostgreSQL (zero infrastructure)
Uses Django's built-in django.contrib.postgres.search to provide full-text
search without any external services. Documents are stored in PostgreSQL tables
with tsvector indexing.
ICV_SEARCH_BACKEND = "icv_search.backends.postgres.PostgresBackend"
# ICV_SEARCH_URL and ICV_SEARCH_API_KEY are ignored by this backend.
The backend automatically creates its tables on first use — no additional migrations required. Supports:
- Full-text search with ranking (ts_rank)
- Django-native filter dicts
- Django-native sort lists
- searchableAttributes from index settings
Best for projects that want search without running Meilisearch, or as a starting point before upgrading to a dedicated search engine.
DummyBackend (testing)
An in-memory backend that stores documents in module-level dicts. No running search engine required. Supports basic substring search, limit, and offset.
# tests/settings.py
ICV_SEARCH_BACKEND = "icv_search.backends.dummy.DummyBackend"
ICV_SEARCH_ASYNC_INDEXING = False # Keep tests synchronous
See the Testing section for fixtures and helpers.
Writing a Custom Backend
Subclass BaseSearchBackend and implement all abstract methods. Point
ICV_SEARCH_BACKEND at the dotted path to your class.
# myproject/search_backends.py
from icv_search.backends.base import BaseSearchBackend
class TypesenseBackend(BaseSearchBackend):
def __init__(self, url: str, api_key: str, timeout: int = 30, **kwargs):
super().__init__(url=url, api_key=api_key, timeout=timeout, **kwargs)
# Initialise your HTTP client here
def create_index(self, uid: str, primary_key: str = "id") -> dict: ...
def delete_index(self, uid: str) -> None: ...
def update_settings(self, uid: str, settings: dict) -> dict: ...
def get_settings(self, uid: str) -> dict: ...
def add_documents(self, uid: str, documents: list[dict], primary_key: str = "id") -> dict: ...
def delete_documents(self, uid: str, document_ids: list[str]) -> dict: ...
def clear_documents(self, uid: str) -> dict: ...
def search(self, uid: str, query: str, **params) -> dict: ...
def get_stats(self, uid: str) -> dict: ...
def health(self) -> bool: ...
# settings.py
ICV_SEARCH_BACKEND = "myproject.search_backends.TypesenseBackend"
Raise icv_search.exceptions.SearchBackendError on failure so the service
layer handles errors consistently.
Management Commands
| Command | Purpose |
|---|---|
| `icv_search_setup [--dry-run]` | Recommended first step. Creates `SearchIndex` records for all entries in `ICV_SEARCH_AUTO_INDEX`, syncs settings to the engine, and verifies connectivity. Use `--dry-run` to preview without making changes |
| `icv_search_health [--verbose]` | Check engine connectivity; `--verbose` prints per-index document counts and sync status |
| `icv_search_sync [--index NAME] [--force] [--tenant TENANT]` | Push index settings from Django to the engine; without `--force`, skips indexes already marked as synced |
| `icv_search_reindex --index NAME --model DOTTED.PATH [--batch-size N] [--tenant TENANT]` | Clear and re-index from `get_search_queryset()` in batches (default 1000) |
| `icv_search_create_index --name NAME [--primary-key FIELD] [--tenant TENANT]` | Create a `SearchIndex` record and provision it in the engine |
| `icv_search_clear --index NAME [--tenant TENANT]` | Remove all documents from an index without deleting it |
# First-time setup — creates all indexes from ICV_SEARCH_AUTO_INDEX
python manage.py icv_search_setup
# Preview what would be created
python manage.py icv_search_setup --dry-run
# Other commands
python manage.py icv_search_health --verbose
python manage.py icv_search_sync --index products --force
python manage.py icv_search_reindex --index products --model myapp.models.Product --batch-size 500
python manage.py icv_search_create_index --name orders --primary-key order_id
python manage.py icv_search_clear --index products
Note:
`SearchIndex` records are also auto-created on first use — calling `search("products", "shoes")` will create the `SearchIndex` record automatically if it does not exist. The `icv_search_setup` command is the recommended way to provision indexes explicitly during deployment.
Celery Tasks
Celery is optional. When it is not installed, the `shared_task` decorator is replaced with a no-op and all operations run synchronously. When it is installed and `ICV_SEARCH_ASYNC_INDEXING = True`, operations are dispatched as background tasks with exponential backoff (max three retries).
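The graceful-degradation idea can be sketched as follows — an assumed fallback decorator (not the library's actual code) that lets `@shared_task`-decorated functions run inline when Celery is absent:

```python
def shared_task(*dargs, **dkwargs):
    """No-op stand-in for celery.shared_task when Celery is not installed."""
    if len(dargs) == 1 and callable(dargs[0]) and not dkwargs:
        return dargs[0]  # bare usage: @shared_task
    def decorator(func):
        return func      # parameterised usage: @shared_task(max_retries=3, ...)
    return decorator


# With the fallback in place, a decorated "task" runs synchronously when called
@shared_task(max_retries=3)
def add_documents_inline(index_pk, documents):
    return len(documents)
```

Because callers invoke tasks the same way in both modes, consuming code needs no `if celery_installed` branches.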
| Task | Signature | Purpose |
|---|---|---|
| `sync_index_settings` | `(index_pk)` | Push settings for one index |
| `sync_all_indexes` | `()` | Sync all unsynced active indexes (periodic, every 5 min) |
| `add_documents` | `(index_pk, documents, primary_key="id")` | Add/update documents |
| `remove_documents` | `(index_pk, document_ids)` | Remove documents |
| `reindex` | `(index_pk, model_path, batch_size=1000)` | Full reindex from model queryset |
| `refresh_document_counts` | `()` | Refresh cached `document_count` from engine stats (periodic, hourly) |
| `reindex_zero_downtime_task` | `(index_pk, model_path, batch_size=1000)` | Zero-downtime reindex via index swap |
| `flush_debounce_buffer` | `(index_pk)` | Drain debounce buffer and batch-index buffered documents |
| `cleanup_search_query_logs` | `(days_older_than=30)` | Delete old search query log entries (periodic, daily) |
| `cleanup_search_query_aggregates` | `(days_older_than=90)` | Delete old search query aggregate rows (periodic, daily) |
# Celery Beat schedule
from celery.schedules import crontab
CELERY_BEAT_SCHEDULE = {
"icv-search-sync-all": {
"task": "icv_search.tasks.sync_all_indexes",
"schedule": crontab(minute="*/5"),
},
"icv-search-refresh-counts": {
"task": "icv_search.tasks.refresh_document_counts",
"schedule": crontab(minute=0),
},
"icv-search-cleanup-query-logs": {
"task": "icv_search.tasks.cleanup_search_query_logs",
"schedule": crontab(hour=3, minute=0), # daily at 03:00
},
"icv-search-cleanup-query-aggregates": {
"task": "icv_search.tasks.cleanup_search_query_aggregates",
"schedule": crontab(hour=3, minute=15), # daily at 03:15
},
}
Signals
All signals are defined in icv_search.signals. Connect in your consuming
project to react to search index lifecycle events.
| Signal | Sender | Kwargs | When |
|---|---|---|---|
| `search_index_created` | `SearchIndex` | `instance` | After a new index is created and provisioned |
| `search_index_deleted` | `SearchIndex` | `instance` | After an index is deleted from Django and the engine |
| `search_index_synced` | `SearchIndex` | `instance` | After settings are pushed to the engine successfully |
| `documents_indexed` | `SearchIndex` | `instance, count, document_ids` | After documents are added or updated |
| `documents_removed` | `SearchIndex` | `instance, count, document_ids` | After documents are removed |
from django.dispatch import receiver
from icv_search.signals import documents_indexed
from icv_search.models import SearchIndex
@receiver(documents_indexed, sender=SearchIndex)
def on_documents_indexed(sender, instance, count, document_ids, **kwargs):
print(f"{count} documents indexed in '{instance.name}'")
Testing
Using DummyBackend
Configure the dummy backend in your test settings:
# tests/settings.py
ICV_SEARCH_BACKEND = "icv_search.backends.dummy.DummyBackend"
ICV_SEARCH_ASYNC_INDEXING = False # Synchronous so assertions work immediately
Test Fixtures
icv_search.testing provides ready-made fixtures and factories:
# conftest.py
from icv_search.testing.fixtures import search_backend, search_index # noqa: F401
| Fixture | What it does |
|---|---|
| `search_backend` | Configures `DummyBackend`, resets state before and after the test |
| `search_index` | Creates a `SearchIndex` instance via `SearchIndexFactory` |
Factories (icv_search.testing.factories):
| Factory | Model |
|---|---|
| `SearchIndexFactory` | `SearchIndex` |
| `IndexSyncLogFactory` | `IndexSyncLog` |
| `SearchQueryAggregateFactory` | `SearchQueryAggregate` |
Asserting Documents
Inspect the DummyBackend's in-memory state directly:
from icv_search.backends.dummy import _documents
def test_article_is_indexed(db, search_backend):
article = ArticleFactory()
# After save, the document should be in the dummy backend
docs = _documents.get(article.search_index_name, {})
assert str(article.pk) in docs
assert docs[str(article.pk)]["title"] == article.title
Use the provided helper functions for common assertions:
from icv_search.testing.helpers import (
get_indexed_documents,
get_dummy_indexes,
assert_document_indexed,
)
def test_product_indexed(db, search_backend):
product = ProductFactory()
assert_document_indexed("products", str(product.pk))
def test_all_products_indexed(db, search_backend):
ProductFactory.create_batch(5)
docs = get_indexed_documents("products")
assert len(docs) == 5
skip_index_update in Tests
Use skip_index_update() in test factories and fixtures to prevent auto-index
noise when creating supporting data that is not the subject of the test:
# tests/factories.py
import factory
from icv_search.auto_index import skip_index_update
class ArticleFactory(factory.django.DjangoModelFactory):
class Meta:
model = Article
@classmethod
def _create(cls, model_class, *args, **kwargs):
with skip_index_update():
return super()._create(model_class, *args, **kwargs)
Multi-Tenancy
In a multi-tenant application, each tenant's search index is distinguished by
a prefix on the engine_uid. The tenant_id is stored as a plain CharField
on SearchIndex — there is no foreign key to a tenant model, so icv-search has
no dependency on any specific tenant implementation.
Configure the prefix callable:
# myproject/search.py
def get_tenant_prefix(request_or_none) -> str:
"""Return the current tenant's slug for use as an index prefix."""
if request_or_none and hasattr(request_or_none, "tenant"):
return request_or_none.tenant.slug
return ""
# settings.py
ICV_SEARCH_TENANT_PREFIX_FUNC = "myproject.search.get_tenant_prefix"
How engine_uid is computed:
engine_uid = {ICV_SEARCH_INDEX_PREFIX}{tenant_id}_{name}
= "staging_acme_products" (prefix="staging_", tenant="acme", name="products")
= "acme_products" (no prefix, tenant="acme", name="products")
= "products" (single-tenant — no prefix, no tenant)
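The naming rule can be expressed as a small helper — a hypothetical function (not the library's own code) that reproduces the examples above, assuming `tenant_id` and `ICV_SEARCH_INDEX_PREFIX` are plain strings:

```python
def compute_engine_uid(name: str, tenant_id: str = "", index_prefix: str = "") -> str:
    """Join the non-empty tenant_id and name with '_' and prepend the global prefix."""
    parts = [part for part in (tenant_id, name) if part]
    return f"{index_prefix}{'_'.join(parts)}"
```

With this rule, single-tenant deployments (empty `tenant_id`, empty prefix) collapse to the bare index name.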
At save time, the callable is invoked with None as the request argument
(there is no HTTP request context during a model save). For request-scoped
prefix resolution, pass tenant_id explicitly when calling service functions:
from icv_search.services import search
results = search("products", "widget", tenant_id=request.tenant.slug)
Omit ICV_SEARCH_TENANT_PREFIX_FUNC (leave it as "") for single-tenant
deployments — all indexes exist in a flat namespace.
Roadmap
- SQLite FTS5 backend
- MySQL FULLTEXT backend
- Async (`httpx.AsyncClient`) support for ASGI applications
- Typesense backend
- Search result click-through tracking
- A/B testing for ranking rules
- PostGIS-backed geo search (production-grade alternative to Haversine)
Licence
MIT