django-icv-search
Pluggable search engine integration for Django — index management, document indexing, and search queries with swappable backends.
Part of the ICV-Django ecosystem.
Features
- Backend abstraction — swappable search backends modelled after Django's email backend pattern; swap Meilisearch for any engine by pointing a setting at your own `BaseSearchBackend` subclass
- Meilisearch backend — default implementation using `httpx` directly, keeping dependencies minimal and leaving the door open for async support
- PostgreSQL backend — zero-infrastructure search using built-in full-text search
- Django-native filter/sort — use dicts and lists instead of engine-specific syntax
- Index management — create, configure, sync, and delete search indexes; Django is the source of truth, the engine is the secondary store
- Document indexing — add, update, delete, and bulk-index documents with full audit logging via `IndexSyncLog`
- SearchableMixin — declare Django models as indexable with a small set of class attributes; override two methods for custom serialisation and queryset filtering
- Auto-indexing — `ICV_SEARCH_AUTO_INDEX` wires `post_save`/`post_delete` signal handlers automatically; disable per-block with `skip_index_update()`
- Multi-tenancy — optional tenant-prefixed index names via a configurable callable; no hard coupling to any specific tenant model
- Management commands — sync, reindex, health check, create, and clear indexes from the command line
- Celery integration — async document indexing and periodic sync tasks; degrades gracefully to synchronous when Celery is not installed
- DummyBackend — in-memory backend for testing without a running search engine; ships with pytest fixtures and helper functions in `icv_search.testing`
- Normalised response types — `TaskResult`, `SearchResult`, and `IndexStats` dataclasses insulate consuming code from engine-specific response shapes
- ICVSearchPaginator — Django `Paginator` subclass that uses `estimated_total_hits` from the search engine; no `queryset.count()` query; `is_estimated` flag for approximate display
- Range filters — `__gte`, `__gt`, `__lte`, `__lt` suffixes on filter dict keys for numeric range queries across all backends
- Facet distribution — `SearchResult.facet_distribution` normalised from the engine response, with `get_facet_values()` helper
- Health check endpoint — `/health/` JSON view for load balancer probes; include via `icv_search.urls`
- Zero-downtime reindex — `reindex_zero_downtime()` creates a temp index, populates it, then atomically swaps with the live index
- Signal debouncing — `ICV_SEARCH_DEBOUNCE_SECONDS` batches rapid auto-index signals into a single indexing call
- Highlighting — `SearchResult.formatted_hits` with `get_highlighted_hits()` helper; supports `highlight_fields`, custom pre/post tags across all three backends
- Ranking scores — `SearchResult.ranking_scores` with `get_hit_with_score()` helper; Meilisearch `_rankingScore`, PostgreSQL `ts_rank`, DummyBackend term-frequency
- Geo-distance search — `geo_point`, `geo_radius`, `geo_sort` params using Meilisearch native `_geoRadius`/`_geoPoint` or Haversine on other backends; `_geoDistance` on hits
- Soft-delete awareness — `SearchableMixin` auto-excludes soft-deleted records (`is_deleted`, `deleted_at`); auto-index removes soft-deleted instances on save
- Multi-search — `multi_search()` executes multiple queries in one request (native `POST /multi-search` on Meilisearch)
- Synonym/stop-word/typo management — dedicated service functions for `get_synonyms()`, `update_synonyms()`, `get_stop_words()`, `get_typo_tolerance()`, and more
- SearchQuery DSL — fluent chainable query builder: `.text().filter().sort().facets().highlight().geo_near().execute()`
- Search analytics — `SearchQueryLog` for individual queries and `SearchQueryAggregate` for daily rollups; `ICV_SEARCH_LOG_MODE` controls strategy (`"individual"`, `"aggregate"`, `"both"`); sample rate control for high-traffic sites; `get_popular_queries()`, `get_zero_result_queries()`, `get_search_stats()`, `get_query_trend()` service functions
- Tenant middleware — `ICVSearchTenantMiddleware` auto-injects tenant context from request; `get_current_tenant_id()` for automatic scoping
- Search result cache — optional `ICVSearchCache` layer using Django's cache framework with automatic invalidation on index changes
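The health check endpoint mentioned above can be wired in with a standard URL include. A minimal sketch — the `search/` prefix is an arbitrary choice, not mandated by the package:

```python
# myproject/urls.py
from django.urls import include, path

urlpatterns = [
    # ...
    # Exposes the /health/ JSON view under /search/health/
    path("search/", include("icv_search.urls")),
]
```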
Installation
Basic (standalone)
pip install django-icv-search
Add to INSTALLED_APPS:
INSTALLED_APPS = [
# ...
"icv_search",
]
Run migrations:
python manage.py migrate icv_search
With icv-core
Installing with the icv-core extra gives you BaseModel (UUID primary key
plus created_at / updated_at timestamps) from
icv-core:
pip install "django-icv-search[icv-core]"
INSTALLED_APPS = [
# ...
"icv_core",
"icv_search",
]
Both SearchIndex and IndexSyncLog inherit from icv_core.models.BaseModel
automatically when icv_core is present.
Quick Start
The following example creates a search index, indexes a handful of documents,
and runs a search against Meilisearch running on localhost:7700.
# settings.py
ICV_SEARCH_BACKEND = "icv_search.backends.meilisearch.MeilisearchBackend"
ICV_SEARCH_URL = "http://localhost:7700"
ICV_SEARCH_API_KEY = "your-meilisearch-master-key"
# 1. Make your model searchable
# myapp/models.py
from django.db import models
from icv_search.mixins import SearchableMixin
class Article(SearchableMixin, models.Model):
search_index_name = "articles"
search_fields = ["title", "body"]
search_filterable_fields = ["published", "author_id"]
search_sortable_fields = ["published_at", "title"]
title = models.CharField(max_length=200)
body = models.TextField()
author_id = models.IntegerField()
published = models.BooleanField(default=False)
published_at = models.DateTimeField(null=True)
# 2. Create the index (run once, e.g. in a migration or management command)
from icv_search.services import create_index
index = create_index("articles", model_class=Article)
# 3. Index documents
from icv_search.services import index_documents
index_documents("articles", [
{"id": "1", "title": "Django tips", "body": "...", "published": True},
{"id": "2", "title": "Search patterns", "body": "...", "published": True},
])
# 4. Search
from icv_search.services import search
results = search("articles", "django", limit=10)
for hit in results.hits:
print(hit["title"])
Configuration
Settings Reference
All settings are namespaced under ICV_SEARCH_*. Every setting has a
sensible default so the package works out of the box for local development.
| Setting | Type | Default | Description |
|---|---|---|---|
| `ICV_SEARCH_BACKEND` | str | `"icv_search.backends.meilisearch.MeilisearchBackend"` | Dotted path to the active search backend class |
| `ICV_SEARCH_URL` | str | `"http://localhost:7700"` | Search engine base URL |
| `ICV_SEARCH_API_KEY` | str | `""` | Master or admin API key for the search engine |
| `ICV_SEARCH_TIMEOUT` | int | `30` | Request timeout in seconds for all backend calls |
| `ICV_SEARCH_TENANT_PREFIX_FUNC` | str | `""` | Dotted path to a callable `(request_or_none) -> str` that returns the tenant prefix. Empty string disables multi-tenancy |
| `ICV_SEARCH_AUTO_SYNC` | bool | `True` | Automatically push index settings to the engine when a `SearchIndex` record is saved |
| `ICV_SEARCH_ASYNC_INDEXING` | bool | `True` | Use Celery for document indexing operations. Falls back to synchronous when Celery is unavailable |
| `ICV_SEARCH_INDEX_PREFIX` | str | `""` | Global prefix applied to all engine index names (e.g. `"staging_"` to segregate environments) |
| `ICV_SEARCH_AUTO_INDEX` | dict | `{}` | Automatic model-level indexing configuration. See below |
| `ICV_SEARCH_DEBOUNCE_SECONDS` | int | `0` | Debounce window in seconds for auto-index signal batching. When > 0, rapid saves are collected and indexed in a single batch after this delay. Requires Django's cache framework. `0` disables debouncing |
| `ICV_SEARCH_LOG_QUERIES` | bool | `False` | Log every `search()` call to `SearchQueryLog` for analytics |
| `ICV_SEARCH_LOG_ZERO_RESULTS_ONLY` | bool | `False` | When `True` (and `LOG_QUERIES` is `True`), only zero-result queries are logged |
| `ICV_SEARCH_LOG_MODE` | str | `"individual"` | Logging strategy: `"individual"` writes per-query rows to `SearchQueryLog`, `"aggregate"` writes daily rollups to `SearchQueryAggregate`, `"both"` writes to both. See Search Analytics |
| `ICV_SEARCH_LOG_SAMPLE_RATE` | float | `1.0` | Fraction of individual `SearchQueryLog` rows to write (0.0–1.0). Only applies to `"individual"` and `"both"` modes. Aggregate counts are always recorded at 100% regardless of this setting |
| `ICV_SEARCH_CACHE_ENABLED` | bool | `False` | Enable search result caching via Django's cache framework |
| `ICV_SEARCH_CACHE_TIMEOUT` | int | `60` | Cache TTL in seconds for stored search results |
| `ICV_SEARCH_CACHE_ALIAS` | str | `"default"` | Django cache alias used by the search result cache |
Auto-Indexing Configuration
ICV_SEARCH_AUTO_INDEX wires post_save and post_delete signal handlers
automatically for any model you declare. The package's AppConfig.ready() reads
this setting and connects the handlers on startup.
Each key in the dict is the logical index name. The value is a configuration dict with the following keys:
| Key | Type | Default | Description |
|---|---|---|---|
| `model` | str | required | `"app_label.ModelName"` — the Django model to watch |
| `on_save` | bool | `True` | Index the document when the model instance is saved |
| `on_delete` | bool | `True` | Remove the document when the model instance is deleted |
| `async` | bool | from `ICV_SEARCH_ASYNC_INDEXING` | Override async behaviour for this index only |
| `auto_create` | bool | `True` | Create the `SearchIndex` record and engine index if they do not yet exist |
| `should_update` | str | `""` | Dotted path to a callable `(instance) -> bool`. When provided, the document is only indexed when the callable returns `True` |
| `updated_field` | str | `""` | Field name used to filter records for incremental reindexing (reserved for future use) |
Example — multiple models:
ICV_SEARCH_AUTO_INDEX = {
"articles": {
"model": "blog.Article",
"on_save": True,
"on_delete": True,
"async": True,
"auto_create": True,
"should_update": "blog.search.should_index_article",
},
"products": {
"model": "catalogue.Product",
"on_save": True,
"on_delete": True,
"async": False, # Synchronous for this index
},
}
# blog/search.py
def should_index_article(instance) -> bool:
"""Only index published articles."""
return instance.published
SearchableMixin
Add SearchableMixin to any Django model to make it indexable. Declare the
index configuration as class attributes.
from django.db import models
from icv_search.mixins import SearchableMixin
class Product(SearchableMixin, models.Model):
# Required: the logical name of the search index
search_index_name = "products"
# Fields included in full-text search
search_fields = ["name", "description", "sku"]
# Fields that can be used in filter expressions
search_filterable_fields = ["category_id", "is_active", "price"]
# Fields that can be used in sort expressions
search_sortable_fields = ["price", "created_at", "name"]
name = models.CharField(max_length=200)
description = models.TextField()
sku = models.CharField(max_length=50, unique=True)
category_id = models.IntegerField()
price = models.DecimalField(max_digits=10, decimal_places=2)
is_active = models.BooleanField(default=True)
created_at = models.DateTimeField(auto_now_add=True)
Customising the document representation
Override to_search_document() to control exactly what is sent to the engine.
The default implementation includes id and all fields listed in
search_fields, converting dates to ISO strings and other non-primitive types
to strings.
def to_search_document(self) -> dict:
return {
"id": str(self.id),
"name": self.name,
"description": self.description,
"sku": self.sku,
"price": float(self.price), # Decimal -> float for JSON
"category_id": self.category_id,
"is_active": self.is_active,
"category_name": self.category.name, # Denormalised for search
}
Customising the reindex queryset
Override get_search_queryset() to control which records are included in a
full reindex and to add select_related / prefetch_related for performance.
@classmethod
def get_search_queryset(cls):
return (
cls.objects
.filter(is_active=True)
.select_related("category")
)
Service API
Import service functions from icv_search.services:
from icv_search.services import (
# Index management
create_index, delete_index, update_index_settings,
get_index_settings, get_index_stats,
# Synonym / stop-word / typo management
get_synonyms, update_synonyms, reset_synonyms,
get_stop_words, update_stop_words, reset_stop_words,
get_typo_tolerance, update_typo_tolerance,
# Document operations
index_documents, remove_documents,
index_model_instances, reindex_all, reindex_zero_downtime,
# Search
search, multi_search, get_task,
# Analytics
get_popular_queries, get_zero_result_queries,
get_search_stats, get_query_trend,
clear_query_logs, clear_query_aggregates,
# Utilities
get_current_tenant_id, ICVSearchCache,
)
Index Management
create_index
def create_index(
name: str,
tenant_id: str = "",
settings: dict | None = None,
primary_key: str = "id",
model_class: type | None = None,
) -> SearchIndex:
Creates a SearchIndex record, provisions the index in the engine, and pushes
any settings. If model_class is provided and it uses SearchableMixin, its
field lists seed the index settings automatically.
from icv_search.services import create_index
from myapp.models import Product
index = create_index(
name="products",
model_class=Product, # Reads search_filterable_fields etc.
settings={"rankingRules": ["words", "typo", "proximity"]}, # Overrides model
)
delete_index
def delete_index(name_or_index: str | SearchIndex, tenant_id: str = "") -> None:
Deletes the SearchIndex record from Django and removes the index from the
engine. Raises SearchBackendError on engine failure.
from icv_search.services import delete_index
delete_index("products")
update_index_settings
def update_index_settings(
name_or_index: str | SearchIndex,
settings: dict,
tenant_id: str = "",
) -> SearchIndex:
Merges settings into the existing index settings, saves to Django, and syncs
to the engine.
from icv_search.services import update_index_settings
index = update_index_settings("products", {
"synonyms": {"phone": ["mobile", "handset"]},
})
get_index_stats
def get_index_stats(name_or_index: str | SearchIndex, tenant_id: str = "") -> IndexStats:
Returns a normalised IndexStats dataclass with live data from the engine.
from icv_search.services import get_index_stats
stats = get_index_stats("products")
print(stats.document_count)
print(stats.is_indexing)
Document Operations
index_documents
def index_documents(
name_or_index: str | SearchIndex,
documents: list[dict],
tenant_id: str = "",
primary_key: str = "id",
) -> TaskResult:
Adds or updates documents in the search index. Returns a TaskResult.
from icv_search.services import index_documents
result = index_documents("products", [
{"id": "abc123", "name": "Widget", "price": 9.99},
{"id": "def456", "name": "Gadget", "price": 24.99},
])
print(result.task_uid)
remove_documents
def remove_documents(
name_or_index: str | SearchIndex,
document_ids: list[str],
tenant_id: str = "",
) -> TaskResult:
Removes documents from the index by their primary key values.
from icv_search.services import remove_documents
remove_documents("products", ["abc123", "def456"])
index_model_instances
def index_model_instances(
model_class: type,
queryset=None,
batch_size: int = 1000,
) -> int:
Indexes model instances using their SearchableMixin configuration. Iterates
the queryset in batches to avoid loading the entire dataset into memory. Returns
the total number of documents indexed.
from icv_search.services import index_model_instances
from myapp.models import Product
count = index_model_instances(Product, batch_size=500)
print(f"Indexed {count} products")
reindex_all
def reindex_all(
name_or_index: str | SearchIndex,
model_class: type,
tenant_id: str = "",
batch_size: int = 1000,
) -> int:
Full reindex: clears all existing documents, then re-indexes from the model's
get_search_queryset(). Use index_model_instances instead if you do not want
to clear first.
from icv_search.services import reindex_all
from myapp.models import Product
total = reindex_all("products", Product, batch_size=500)
print(f"Reindexed {total} products")
reindex_zero_downtime
def reindex_zero_downtime(
name_or_index: str | SearchIndex,
model_class: type,
tenant_id: str = "",
batch_size: int = 1000,
) -> int:
Zero-downtime reindex: creates a temporary index with the same settings,
populates it from the model queryset, then atomically swaps it with the live
index. The old index is deleted after the swap. Falls back to reindex_all()
if the backend does not support index swaps.
from icv_search.services import reindex_zero_downtime
from myapp.models import Product
total = reindex_zero_downtime("products", Product, batch_size=500)
Search
Filters and sort orders use Django-native syntax — the service layer translates them to each engine's native format automatically. This means the same calling code works across all backends.
from icv_search.services import search
# Django-native filter dict
result = search("products", "padel", filter={"category": "equipment", "is_active": True})
# Django-native sort list (- prefix = descending)
result = search("products", "", sort=["-price", "name"])
# Combined
result = search("products", "padel",
filter={"city": "Madrid", "is_active": True},
sort=["-created_at"],
limit=10,
)
search
def search(
name_or_index: str | SearchIndex,
query: str,
tenant_id: str = "",
**params,
) -> SearchResult:
Executes a search query and returns a normalised SearchResult. Additional
keyword arguments are passed to the engine (e.g. filter, sort, limit,
offset, facets).
from icv_search.services import search
# Basic search
results = search("products", "widget")
# With Django-native filter dict and sort list
results = search(
"products",
"widget",
filter={"is_active": True, "price__lt": 50},
sort=["-price"],
limit=20,
offset=0,
)
for hit in results.hits:
print(hit["name"], hit["price"])
print(f"About {results.estimated_total_hits} results")
Pagination
ICVSearchPaginator wraps a SearchResult for use with Django's pagination
machinery (ListView, templates). It uses estimated_total_hits as the count
instead of running a separate queryset.count() query.
from icv_search import ICVSearchPaginator
from icv_search.services import search
# In a view
page_number = int(request.GET.get("page", 1))
per_page = 25
result = search("products", query, limit=per_page, offset=(page_number - 1) * per_page)
paginator = ICVSearchPaginator(result, per_page=per_page)
page_obj = paginator.get_page(page_number)
# In a template
{% for hit in page_obj %}
{{ hit.name }}
{% endfor %}
{% if page_obj.is_estimated %}
{{ page_obj.display_count }} results
{% else %}
{{ page_obj.paginator.count }} results
{% endif %}
Facets
When requesting facets from the search engine, the normalised facet_distribution
is available directly on SearchResult:
result = search("products", "shoes", facets=["brand", "colour"])
print(result.facet_distribution)
# {"brand": {"Nike": 42, "Adidas": 31}, "colour": {"black": 55, "white": 28}}
# Convenience helper — sorted by count descending
for facet in result.get_facet_values("brand"):
print(f"{facet['name']}: {facet['count']}")
Range Filters
Use Django-style lookup suffixes for numeric range queries:
result = search("products", "",
filter={"price__gte": 10, "price__lte": 100, "is_active": True},
)
Supported suffixes: __gte (>=), __gt (>), __lte (<=), __lt (<).
Works across all backends.
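To make the translation concrete, here is a rough illustration of how a Django-style filter dict could map onto a Meilisearch-style filter expression. This is a sketch of the idea, not the package's actual translation code:

```python
def to_meili_filter(filters: dict) -> str:
    """Translate a Django-style filter dict into a Meilisearch-style filter string (sketch)."""
    ops = {"gte": ">=", "gt": ">", "lte": "<=", "lt": "<"}
    clauses = []
    for key, value in filters.items():
        field, _, suffix = key.rpartition("__")
        if field and suffix in ops:
            # Range lookup: price__gte -> "price >= 10"
            clauses.append(f"{field} {ops[suffix]} {value}")
        else:
            # Equality: booleans lowercased, strings quoted
            if isinstance(value, bool):
                rendered = str(value).lower()
            elif isinstance(value, str):
                rendered = f"'{value}'"
            else:
                rendered = str(value)
            clauses.append(f"{key} = {rendered}")
    return " AND ".join(clauses)

print(to_meili_filter({"price__gte": 10, "price__lte": 100, "is_active": True}))
# price >= 10 AND price <= 100 AND is_active = true
```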
Bulk Operations
skip_index_update
A context manager that temporarily disables auto-indexing signal handlers.
Use this in bulk imports, data migrations, and test factories to avoid
triggering individual index updates for every save() call.
from icv_search.auto_index import skip_index_update
from myapp.models import Article
articles = [Article(title=f"Article {i}") for i in range(1000)]
with skip_index_update():
Article.objects.bulk_create(articles)
# No search index updates during this block
# Trigger a single reindex after the bulk operation
from icv_search.services import reindex_all
reindex_all("articles", Article)
The context manager is nestable. Auto-indexing resumes when the outermost
with block exits.
Highlighting
Pass highlight_fields to get highlighted versions of matching text:
result = search("articles", "django tips",
highlight_fields=["title", "body"],
highlight_pre_tag="<mark>", # default
highlight_post_tag="</mark>", # default
)
# Highlighted versions of each hit
for hit in result.get_highlighted_hits():
print(hit["title"]) # "...about <mark>Django</mark> <mark>tips</mark>..."
# Or access directly
result.formatted_hits # list of highlighted hit dicts
Works across all backends: Meilisearch uses native _formatted, PostgreSQL uses
ts_headline(), DummyBackend wraps matching substrings.
Ranking Scores
Request ranking scores to understand result relevance:
result = search("products", "shoes", show_ranking_score=True)
for i, hit in enumerate(result.hits):
hit, score = result.get_hit_with_score(i)
print(f"{hit['name']}: {score:.2f}")
Meilisearch returns _rankingScore (0.0–1.0), PostgreSQL uses ts_rank, and
DummyBackend computes a simple term-frequency score.
Geo-Distance Search
Filter and sort results by geographic distance:
# Find restaurants within 5km of a point
result = search("restaurants", "",
geo_point=(51.5074, -0.1278), # London (lat, lng)
geo_radius=5000, # metres
geo_sort="asc", # nearest first
)
for hit in result.hits:
print(f"{hit['name']}: {hit.get('_geoDistance')}m away")
Models with geo data should declare it on the mixin:
class Restaurant(SearchableMixin, models.Model):
search_index_name = "restaurants"
search_fields = ["name", "cuisine"]
search_lat_field = "latitude"
search_lng_field = "longitude"
latitude = models.FloatField()
longitude = models.FloatField()
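For the non-Meilisearch backends, distances fall back to Haversine. As an illustration of that formula (not the package's exact implementation):

```python
import math

def haversine_m(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in metres between two (lat, lng) points."""
    r = 6_371_000  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlng = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlng / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# London to Paris is roughly 344 km
print(round(haversine_m(51.5074, -0.1278, 48.8566, 2.3522) / 1000))
```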
Multi-Search
Execute multiple queries in a single request:
from icv_search.services import multi_search
results = multi_search([
{"index_name": "products", "query": "shoes", "limit": 5},
{"index_name": "articles", "query": "shoes", "limit": 3, "facets": ["category"]},
])
product_results, article_results = results
Meilisearch uses the native POST /multi-search endpoint. Other backends
execute queries sequentially.
Synonym and Stop-Word Management
from icv_search.services import (
get_synonyms, update_synonyms, reset_synonyms,
get_stop_words, update_stop_words, reset_stop_words,
get_typo_tolerance, update_typo_tolerance,
)
# Synonyms
update_synonyms("products", {"phone": ["mobile", "handset"], "laptop": ["notebook"]})
print(get_synonyms("products"))
reset_synonyms("products")
# Stop words
update_stop_words("products", ["the", "a", "an", "is"])
print(get_stop_words("products"))
# Typo tolerance
update_typo_tolerance("products", {"enabled": True, "minWordSizeForTypos": {"oneTypo": 4}})
SearchQuery Builder
A fluent API for building search queries:
from icv_search import SearchQuery
results = (
SearchQuery("products")
.text("running shoes")
.filter(brand="Nike", price__gte=50)
.sort("-price", "name")
.facets("brand", "category")
.highlight("name", "description")
.geo_near(lat=51.5, lng=-0.12, radius=5000)
.with_ranking_scores()
.limit(20)
.execute()
)
# Or get a paginator directly
paginator = SearchQuery("products").text("shoes").limit(25).paginate()
page = paginator.get_page(1)
Search Analytics
Enable query logging to track search behaviour:
# settings.py
ICV_SEARCH_LOG_QUERIES = True
ICV_SEARCH_LOG_ZERO_RESULTS_ONLY = False # Set True to reduce storage
Logging strategies
ICV_SEARCH_LOG_MODE controls how queries are recorded:
| Mode | Storage | Best for |
|---|---|---|
| `"individual"` (default) | One `SearchQueryLog` row per query | Low/medium traffic sites that need full query history |
| `"aggregate"` | Daily rollups in `SearchQueryAggregate` | High-traffic sites where individual rows would be too large |
| `"both"` | Both individual rows and daily rollups | Sites that want detailed logs for recent queries plus long-term trends |
# settings.py — high-traffic configuration
ICV_SEARCH_LOG_QUERIES = True
ICV_SEARCH_LOG_MODE = "aggregate" # Daily rollups only
ICV_SEARCH_LOG_SAMPLE_RATE = 1.0 # Not applicable in aggregate-only mode
# Or keep individual logs with sampling
ICV_SEARCH_LOG_MODE = "both"
ICV_SEARCH_LOG_SAMPLE_RATE = 0.1 # Write only 10% of individual rows
Aggregate queries are normalised (stripped and lowercased) for consistent grouping. The sample rate only affects SearchQueryLog rows — aggregate counts always reflect 100% of queries.
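The sampling decision itself is simple to picture: something along these lines, shown purely as an illustration (the function name is hypothetical):

```python
import random

def should_log_individual(sample_rate: float) -> bool:
    """Return True for the fraction of queries that get an individual log row."""
    return random.random() < sample_rate

# Edge cases: a rate of 1.0 always logs, 0.0 never does
assert should_log_individual(1.0) is True
assert should_log_individual(0.0) is False
```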
Analytics service functions
from icv_search.services import (
get_popular_queries,
get_zero_result_queries,
get_search_stats,
get_query_trend,
clear_query_logs,
clear_query_aggregates,
)
# Most frequent queries in the last 7 days
popular = get_popular_queries("products", days=7, limit=20)
# Queries that returned no results
gaps = get_zero_result_queries("products", days=7)
# Aggregate stats
stats = get_search_stats("products", days=7)
# {"total_queries": 1234, "avg_processing_time_ms": 12, "zero_result_rate": 0.05}
# Day-by-day trend for a specific query (reads from SearchQueryAggregate)
trend = get_query_trend("running shoes", "products", days=30)
# [{"date": date(2026, 3, 1), "count": 42, "zero_result_count": 3, "avg_processing_time_ms": 8.5}, ...]
# Cleanup
deleted = clear_query_logs(days_older_than=30)
deleted = clear_query_aggregates(days_older_than=90)
All analytics functions (get_popular_queries, get_zero_result_queries, get_search_stats) automatically read from the correct model based on ICV_SEARCH_LOG_MODE.
Tenant Middleware
Auto-inject tenant context from the request instead of passing tenant_id
on every call:
# settings.py
MIDDLEWARE = [
# ...
"icv_search.middleware.ICVSearchTenantMiddleware",
]
ICV_SEARCH_TENANT_PREFIX_FUNC = "myproject.search.get_tenant_prefix"
# In a view — tenant_id is injected automatically
results = search("products", "widget") # No tenant_id needed
# Explicit tenant_id always takes precedence
results = search("products", "widget", tenant_id="other_tenant")
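The callable referenced by `ICV_SEARCH_TENANT_PREFIX_FUNC` receives the request (or `None` for calls outside a request cycle) and returns the prefix string. A minimal sketch, assuming your own tenancy middleware sets `request.tenant` with a `slug` attribute (both are assumptions, not part of this package):

```python
# myproject/search.py — hypothetical tenant prefix callable
def get_tenant_prefix(request_or_none) -> str:
    """Return a tenant prefix such as "acme_", or "" when no tenant applies."""
    if request_or_none is None:
        return ""
    # `request.tenant` is assumed to be set by your own tenancy middleware
    tenant = getattr(request_or_none, "tenant", None)
    return f"{tenant.slug}_" if tenant else ""
```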
Search Result Cache
Enable caching to reduce backend load for repeated queries:
# settings.py
ICV_SEARCH_CACHE_ENABLED = True
ICV_SEARCH_CACHE_TIMEOUT = 60 # seconds
ICV_SEARCH_CACHE_ALIAS = "default" # Django cache alias
Cache is automatically invalidated when documents are indexed or removed.
Queries with a user param bypass the cache (analytics-aware searches may
vary by user).
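One way to picture the cache key is a stable hash over the index name, query, and search params. This is an illustration of the idea, not the package's actual key format:

```python
import hashlib
import json

def search_cache_key(index_name: str, query: str, params: dict) -> str:
    """Build a deterministic cache key for a search call (illustrative)."""
    # sort_keys makes the key independent of param ordering
    payload = json.dumps({"q": query, "params": params}, sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()[:16]
    return f"icv_search:{index_name}:{digest}"

key = search_cache_key("products", "shoes", {"limit": 20, "filter": {"is_active": True}})
```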
Response Types
All service functions return normalised dataclasses, insulating your code from engine-specific response shapes.
TaskResult
Returned by document indexing and deletion operations.
@dataclass
class TaskResult:
task_uid: str # Engine-assigned task identifier
status: str # Task status (e.g. "enqueued", "succeeded")
detail: str # Operation type or description
raw: dict # Original engine response
SearchResult
Returned by search().
@dataclass
class SearchResult:
hits: list[dict] # Matching documents
query: str # The query string as echoed by the engine
processing_time_ms: int # Time taken by the engine (milliseconds)
estimated_total_hits: int # Approximate total matching documents
limit: int # Page size applied
offset: int # Offset applied
facet_distribution: dict[str, dict[str, int]] # Facet counts by field
formatted_hits: list[dict] # Highlighted versions of hits
ranking_scores: list[float | None] # Relevance scores per hit
raw: dict # Original engine response
def get_highlighted_hits(self) -> list[dict]: ...
def get_facet_values(facet_name: str) -> list[dict]: ...
def get_hit_with_score(index: int) -> tuple[dict, float | None]: ...
IndexStats
Returned by get_index_stats().
@dataclass
class IndexStats:
document_count: int # Number of indexed documents
is_indexing: bool # Whether the engine is currently indexing
field_distribution: dict[str, int] # Field name -> document count
raw: dict # Original engine response
Backends
Meilisearch (default)
Requires a running Meilisearch instance (v1.0+). Uses httpx directly rather
than the official SDK, keeping dependencies minimal.
Required settings:
ICV_SEARCH_BACKEND = "icv_search.backends.meilisearch.MeilisearchBackend"
ICV_SEARCH_URL = "http://localhost:7700"
ICV_SEARCH_API_KEY = "your-master-key" # Leave blank if no auth configured
PostgreSQL (zero infrastructure)
Uses Django's built-in django.contrib.postgres.search to provide full-text
search without any external services. Documents are stored in PostgreSQL tables
with tsvector indexing.
ICV_SEARCH_BACKEND = "icv_search.backends.postgres.PostgresBackend"
# ICV_SEARCH_URL and ICV_SEARCH_API_KEY are ignored by this backend.
The backend automatically creates its tables on first use — no additional migrations required. Supports:
- Full-text search with ranking (ts_rank)
- Django-native filter dicts
- Django-native sort lists
- searchableAttributes from index settings
Best for projects that want search without running Meilisearch, or as a starting point before upgrading to a dedicated search engine.
DummyBackend (testing)
An in-memory backend that stores documents in module-level dicts. No running search engine required. Supports basic substring search, limit, and offset.
# tests/settings.py
ICV_SEARCH_BACKEND = "icv_search.backends.dummy.DummyBackend"
ICV_SEARCH_ASYNC_INDEXING = False # Keep tests synchronous
See the Testing section for fixtures and helpers.
Writing a Custom Backend
Subclass BaseSearchBackend and implement all abstract methods. Point
ICV_SEARCH_BACKEND at the dotted path to your class.
# myproject/search_backends.py
from icv_search.backends.base import BaseSearchBackend
class TypesenseBackend(BaseSearchBackend):
def __init__(self, url: str, api_key: str, timeout: int = 30, **kwargs):
super().__init__(url=url, api_key=api_key, timeout=timeout, **kwargs)
# Initialise your HTTP client here
def create_index(self, uid: str, primary_key: str = "id") -> dict: ...
def delete_index(self, uid: str) -> None: ...
def update_settings(self, uid: str, settings: dict) -> dict: ...
def get_settings(self, uid: str) -> dict: ...
def add_documents(self, uid: str, documents: list[dict], primary_key: str = "id") -> dict: ...
def delete_documents(self, uid: str, document_ids: list[str]) -> dict: ...
def clear_documents(self, uid: str) -> dict: ...
def search(self, uid: str, query: str, **params) -> dict: ...
def get_stats(self, uid: str) -> dict: ...
def health(self) -> bool: ...
# settings.py
ICV_SEARCH_BACKEND = "myproject.search_backends.TypesenseBackend"
Raise icv_search.exceptions.SearchBackendError on failure so the service
layer handles errors consistently.
Management Commands
| Command | Purpose |
|---|---|
| `icv_search_setup [--dry-run]` | Recommended first step. Creates `SearchIndex` records for all entries in `ICV_SEARCH_AUTO_INDEX`, syncs settings to the engine, and verifies connectivity. Use `--dry-run` to preview without making changes |
| `icv_search_health [--verbose]` | Check engine connectivity; `--verbose` prints per-index document counts and sync status |
| `icv_search_sync [--index NAME] [--force] [--tenant TENANT]` | Push index settings from Django to the engine; without `--force`, skips indexes already marked as synced |
| `icv_search_reindex --index NAME --model DOTTED.PATH [--batch-size N] [--tenant TENANT]` | Clear and re-index from `get_search_queryset()` in batches (default 1000) |
| `icv_search_create_index --name NAME [--primary-key FIELD] [--tenant TENANT]` | Create a `SearchIndex` record and provision it in the engine |
| `icv_search_clear --index NAME [--tenant TENANT]` | Remove all documents from an index without deleting it |
# First-time setup — creates all indexes from ICV_SEARCH_AUTO_INDEX
python manage.py icv_search_setup
# Preview what would be created
python manage.py icv_search_setup --dry-run
# Other commands
python manage.py icv_search_health --verbose
python manage.py icv_search_sync --index products --force
python manage.py icv_search_reindex --index products --model myapp.models.Product --batch-size 500
python manage.py icv_search_create_index --name orders --primary-key order_id
python manage.py icv_search_clear --index products
Note:
`SearchIndex` records are also auto-created on first use — calling `search("products", "shoes")` will create the `SearchIndex` record automatically if it does not exist. The `icv_search_setup` command is the recommended way to provision indexes explicitly during deployment.
Celery Tasks
Celery is optional. When it is not installed, the `shared_task` decorator is replaced with a no-op and all operations run synchronously. When it is installed and `ICV_SEARCH_ASYNC_INDEXING = True`, operations are dispatched as background tasks with exponential backoff (max three retries).
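The graceful-degradation idea can be sketched as follows — an assumed fallback decorator (not the library's actual code) that lets `@shared_task`-decorated functions run inline when Celery is absent:

```python
def shared_task(*dargs, **dkwargs):
    """No-op stand-in for celery.shared_task when Celery is not installed."""
    if len(dargs) == 1 and callable(dargs[0]) and not dkwargs:
        return dargs[0]  # bare usage: @shared_task
    def decorator(func):
        return func      # parameterised usage: @shared_task(max_retries=3, ...)
    return decorator


# With the fallback in place, a decorated "task" runs synchronously when called
@shared_task(max_retries=3)
def add_documents_inline(index_pk, documents):
    return len(documents)
```

Because callers invoke tasks the same way in both modes, consuming code needs no `if celery_installed` branches.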
| Task | Signature | Purpose |
|---|---|---|
| `sync_index_settings` | `(index_pk)` | Push settings for one index |
| `sync_all_indexes` | `()` | Sync all unsynced active indexes (periodic, every 5 min) |
| `add_documents` | `(index_pk, documents, primary_key="id")` | Add/update documents |
| `remove_documents` | `(index_pk, document_ids)` | Remove documents |
| `reindex` | `(index_pk, model_path, batch_size=1000)` | Full reindex from model queryset |
| `refresh_document_counts` | `()` | Refresh cached `document_count` from engine stats (periodic, hourly) |
| `reindex_zero_downtime_task` | `(index_pk, model_path, batch_size=1000)` | Zero-downtime reindex via index swap |
| `flush_debounce_buffer` | `(index_pk)` | Drain debounce buffer and batch-index buffered documents |
| `cleanup_search_query_logs` | `(days_older_than=30)` | Delete old search query log entries (periodic, daily) |
| `cleanup_search_query_aggregates` | `(days_older_than=90)` | Delete old search query aggregate rows (periodic, daily) |
# Celery Beat schedule
from celery.schedules import crontab
CELERY_BEAT_SCHEDULE = {
"icv-search-sync-all": {
"task": "icv_search.tasks.sync_all_indexes",
"schedule": crontab(minute="*/5"),
},
"icv-search-refresh-counts": {
"task": "icv_search.tasks.refresh_document_counts",
"schedule": crontab(minute=0),
},
"icv-search-cleanup-query-logs": {
"task": "icv_search.tasks.cleanup_search_query_logs",
"schedule": crontab(hour=3, minute=0), # daily at 03:00
},
"icv-search-cleanup-query-aggregates": {
"task": "icv_search.tasks.cleanup_search_query_aggregates",
"schedule": crontab(hour=3, minute=15), # daily at 03:15
},
}
Signals
All signals are defined in icv_search.signals. Connect in your consuming
project to react to search index lifecycle events.
| Signal | Sender | Kwargs | When |
|---|---|---|---|
| `search_index_created` | `SearchIndex` | `instance` | After a new index is created and provisioned |
| `search_index_deleted` | `SearchIndex` | `instance` | After an index is deleted from Django and the engine |
| `search_index_synced` | `SearchIndex` | `instance` | After settings are pushed to the engine successfully |
| `documents_indexed` | `SearchIndex` | `instance, count, document_ids` | After documents are added or updated |
| `documents_removed` | `SearchIndex` | `instance, count, document_ids` | After documents are removed |
from django.dispatch import receiver
from icv_search.signals import documents_indexed
from icv_search.models import SearchIndex
@receiver(documents_indexed, sender=SearchIndex)
def on_documents_indexed(sender, instance, count, document_ids, **kwargs):
print(f"{count} documents indexed in '{instance.name}'")
Testing
Using DummyBackend
Configure the dummy backend in your test settings:
# tests/settings.py
ICV_SEARCH_BACKEND = "icv_search.backends.dummy.DummyBackend"
ICV_SEARCH_ASYNC_INDEXING = False # Synchronous so assertions work immediately
Test Fixtures
icv_search.testing provides ready-made fixtures and factories:
# conftest.py
from icv_search.testing.fixtures import search_backend, search_index # noqa: F401
| Fixture | What it does |
|---|---|
| `search_backend` | Configures `DummyBackend`, resets state before and after the test |
| `search_index` | Creates a `SearchIndex` instance via `SearchIndexFactory` |
Factories (icv_search.testing.factories):
| Factory | Model |
|---|---|
| `SearchIndexFactory` | `SearchIndex` |
| `IndexSyncLogFactory` | `IndexSyncLog` |
| `SearchQueryAggregateFactory` | `SearchQueryAggregate` |
Asserting Documents
Inspect the DummyBackend's in-memory state directly:
from icv_search.backends.dummy import _documents
def test_article_is_indexed(db, search_backend):
article = ArticleFactory()
# After save, the document should be in the dummy backend
docs = _documents.get(article.search_index_name, {})
assert str(article.pk) in docs
assert docs[str(article.pk)]["title"] == article.title
Use the provided helper functions for common assertions:
from icv_search.testing.helpers import (
get_indexed_documents,
get_dummy_indexes,
assert_document_indexed,
)
def test_product_indexed(db, search_backend):
product = ProductFactory()
assert_document_indexed("products", str(product.pk))
def test_all_products_indexed(db, search_backend):
ProductFactory.create_batch(5)
docs = get_indexed_documents("products")
assert len(docs) == 5
skip_index_update in Tests
Use skip_index_update() in test factories and fixtures to prevent auto-index
noise when creating supporting data that is not the subject of the test:
# tests/factories.py
import factory
from icv_search.auto_index import skip_index_update
class ArticleFactory(factory.django.DjangoModelFactory):
class Meta:
model = Article
@classmethod
def _create(cls, model_class, *args, **kwargs):
with skip_index_update():
return super()._create(model_class, *args, **kwargs)
Multi-Tenancy
In a multi-tenant application, each tenant's search index is distinguished by
a prefix on the engine_uid. The tenant_id is stored as a plain CharField
on SearchIndex — there is no foreign key to a tenant model, so icv-search has
no dependency on any specific tenant implementation.
Configure the prefix callable:
# myproject/search.py
def get_tenant_prefix(request_or_none) -> str:
"""Return the current tenant's slug for use as an index prefix."""
if request_or_none and hasattr(request_or_none, "tenant"):
return request_or_none.tenant.slug
return ""
# settings.py
ICV_SEARCH_TENANT_PREFIX_FUNC = "myproject.search.get_tenant_prefix"
How engine_uid is computed:
engine_uid = {ICV_SEARCH_INDEX_PREFIX}{tenant_id}_{name}
= "staging_acme_products" (prefix="staging_", tenant="acme", name="products")
= "acme_products" (no prefix, tenant="acme", name="products")
= "products" (single-tenant — no prefix, no tenant)
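The naming rule can be expressed as a small helper — a hypothetical function (not the library's own code) that reproduces the examples above, assuming `tenant_id` and `ICV_SEARCH_INDEX_PREFIX` are plain strings:

```python
def compute_engine_uid(name: str, tenant_id: str = "", index_prefix: str = "") -> str:
    """Join the non-empty tenant_id and name with '_' and prepend the global prefix."""
    parts = [part for part in (tenant_id, name) if part]
    return f"{index_prefix}{'_'.join(parts)}"
```

With this rule, single-tenant deployments (empty `tenant_id`, empty prefix) collapse to the bare index name.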
At save time, the callable is invoked with None as the request argument
(there is no HTTP request context during a model save). For request-scoped
prefix resolution, pass tenant_id explicitly when calling service functions:
from icv_search.services import search
results = search("products", "widget", tenant_id=request.tenant.slug)
Omit ICV_SEARCH_TENANT_PREFIX_FUNC (leave it as "") for single-tenant
deployments — all indexes exist in a flat namespace.
Roadmap
- SQLite FTS5 backend
- MySQL FULLTEXT backend
- Async (`httpx.AsyncClient`) support for ASGI applications
- Typesense backend
- Search result click-through tracking
- A/B testing for ranking rules
- PostGIS-backed geo search (production-grade alternative to Haversine)
Licence
MIT