A lightweight, dependency-free Elasticsearch query builder for Python

elastic-query-builder

A lightweight, zero-dependency Python library for building Elasticsearch queries using the builder pattern.

Build Elasticsearch queries in Python with type hints, method chaining, and zero external dependencies. The library generates plain dict output that can be passed directly to any Elasticsearch client.


Features

  • Zero dependencies -- pure Python only, no external packages required
  • Builder pattern with a fluent, chainable API
  • Full type hints for IDE autocomplete and static analysis
  • Comprehensive query support:
    • Leaf queries: Term, Terms, Match, MatchPhrase, Range, Wildcard, Exists, IDs, MatchAll, MatchNone
    • Compound queries: Bool, DisMax, Nested
    • Span queries: SpanTerm, SpanNear
  • Aggregations:
    • Bucket: Terms, DateHistogram, Histogram, Range, Filter, Filters, Nested
    • Metric: Sum, Avg, Min, Max, Stats, Cardinality, TopHits
  • Sort builder with field sorting, score sorting, and script sorting
  • Source filtering with includes/excludes control
  • Generates plain dict -- use with elasticsearch-py, opensearch-py, or any HTTP client

Installation

pip install elastic-query-builder

Quick Start

from elastic_query_builder import QueryBuilder

qb = QueryBuilder()
query = (
    qb.add_must(QueryBuilder.Match.build("title", "elasticsearch"))
      .add_filter(QueryBuilder.Range.build("date", gte="2024-01-01"))
      .set_size(10)
      .build()
)

# Use with any ES client
# es.search(index="my-index", body=query)
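
Because build() returns a plain dict, the result can be inspected or serialized before it reaches a client. The sketch below writes out by hand the body that the query above would be expected to correspond to in the Elasticsearch Query DSL, then prepares an HTTP request for it using only the standard library. The exact dict the library emits (for example, whether match uses the expanded "query" form) is an assumption, as are the host http://localhost:9200 and the index name my-index. The request is built but not sent.

```python
import json
import urllib.request

# Hand-written equivalent of the Quick Start query.
# This is an assumed shape based on the Query DSL, not captured library output.
body = {
    "query": {
        "bool": {
            "must": [{"match": {"title": {"query": "elasticsearch"}}}],
            "filter": [{"range": {"date": {"gte": "2024-01-01"}}}],
        }
    },
    "size": 10,
}


def make_search_request(base_url: str, index: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to the _search endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Host and index name are placeholders for illustration.
request = make_search_request("http://localhost:9200", "my-index", body)
# urllib.request.urlopen(request) would run the search against a live cluster.
```

Any client that accepts a JSON-serializable dict should work the same way, since nothing in the body depends on a particular client library.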

Usage Examples

Basic Search

from elastic_query_builder import QueryBuilder

qb = QueryBuilder()
query = (
    qb.add_must(QueryBuilder.Match.build("title", "검색어", operator="and"))  # "검색어" is Korean for "search term"
      .add_filter(QueryBuilder.Range.build("date", gte="2024-01-01"))
      .set_size(10)
      .build()
)

Leaf Queries (Standalone)

Each query type can be used independently to generate its corresponding ES query dict:

from elastic_query_builder.query.leaf import TermQuery, MatchQuery, RangeQuery

# Term query
term = TermQuery.build("status", "active")
# {"term": {"status": {"value": "active"}}}

# Match query with options
match = MatchQuery.build("title", "search terms", boost=2.0, operator="and")
# {"match": {"title": {"query": "search terms", "boost": 2.0, "operator": "and"}}}

# Range query
range_q = RangeQuery.build("price", gte=100, lte=500)
# {"range": {"price": {"gte": 100, "lte": 500}}}
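
The output comments above suggest that optional arguments appear in the dict only when they are supplied. A minimal sketch of that behavior follows. The functions term_query, match_query, and range_query are hypothetical stand-ins written for illustration; they are not the library's classes and may differ from its implementation.

```python
from typing import Any, Dict, Optional


def _drop_none(options: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the options that were actually provided."""
    return {key: value for key, value in options.items() if value is not None}


def term_query(field: str, value: Any) -> Dict[str, Any]:
    """Hypothetical helper: exact-value match on a single field."""
    return {"term": {field: {"value": value}}}


def match_query(
    field: str,
    query: str,
    boost: Optional[float] = None,
    operator: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: full-text match with optional settings."""
    options = _drop_none({"boost": boost, "operator": operator})
    return {"match": {field: {"query": query, **options}}}


def range_query(
    field: str,
    gte: Any = None,
    lte: Any = None,
    gt: Any = None,
    lt: Any = None,
) -> Dict[str, Any]:
    """Hypothetical helper: only the bounds that were given are emitted."""
    return {"range": {field: _drop_none({"gte": gte, "lte": lte, "gt": gt, "lt": lt})}}
```

Filtering on "is not None" rather than on truthiness keeps legitimate falsy values such as 0 or an empty string in the output.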

Bool Query with Nested Bool

from elastic_query_builder import QueryBuilder

qb = QueryBuilder()

# Create an inner bool query
inner_bool = qb.nested_bool()
inner_bool.add_should(QueryBuilder.Match.build("title", "keyword"))
inner_bool.add_should(QueryBuilder.Match.build("content", "keyword"))
inner_bool.add_minimum_should_match(1)

# Nest it inside the outer bool query
query = (
    qb.add_must(inner_bool.build())
      .add_filter(QueryBuilder.Term.build("status", "published"))
      .set_size(20)
      .build()
)
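
In Query DSL terms, the inner builder produces a bool clause that sits inside the outer must list. The hand-written dicts below show the structure this example would be expected to map to. They follow the standard DSL; the library's exact output (for instance its use of the expanded match and term forms) is an assumption.

```python
# Inner bool: at least one of the two should clauses has to match.
inner_bool = {
    "bool": {
        "should": [
            {"match": {"title": {"query": "keyword"}}},
            {"match": {"content": {"query": "keyword"}}},
        ],
        "minimum_should_match": 1,
    }
}

# Outer bool: the inner clause is scored (must), the status check is not (filter).
expected_body = {
    "query": {
        "bool": {
            "must": [inner_bool],
            "filter": [{"term": {"status": {"value": "published"}}}],
        }
    },
    "size": 20,
}
```

Placing the status check under filter rather than must means it restricts the result set without contributing to the relevance score.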

Aggregations

Basic aggregations:

qb = QueryBuilder()
query = (
    qb.set_match_all()
      .set_size(0)
      .add_terms_agg("status_count", "status", size=10)
      .add_date_histogram_agg("monthly", "created_at", calendar_interval="1M")
      .build()
)

Nested aggregations:

qb = QueryBuilder()

sub_agg = qb.nested_agg()
sub_agg.add_terms("item_names", "items.name", size=5)

query = (
    qb.set_match_all()
      .set_size(0)
      .add_nested_agg("items_agg", "items", sub_agg.build())
      .build()
)
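
For reference, the two aggregation examples above would be expected to correspond to request bodies like the ones built below. The helpers terms_agg, date_histogram_agg, and nested_agg are hypothetical and exist only to show the DSL structure; the library's own output may differ in detail.

```python
from typing import Any, Dict


def terms_agg(field: str, size: int) -> Dict[str, Any]:
    """Hypothetical helper: bucket documents by the values of a field."""
    return {"terms": {"field": field, "size": size}}


def date_histogram_agg(field: str, calendar_interval: str) -> Dict[str, Any]:
    """Hypothetical helper: bucket documents by calendar interval."""
    return {"date_histogram": {"field": field, "calendar_interval": calendar_interval}}


def nested_agg(path: str, sub_aggs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: run sub-aggregations inside a nested path."""
    return {"nested": {"path": path}, "aggs": sub_aggs}


# Basic aggregations; size 0 suppresses hits so only buckets are returned.
basic_body = {
    "query": {"match_all": {}},
    "size": 0,
    "aggs": {
        "status_count": terms_agg("status", size=10),
        "monthly": date_histogram_agg("created_at", calendar_interval="1M"),
    },
}

# Nested aggregation; the sub-aggregation uses the full dotted field name.
nested_body = {
    "query": {"match_all": {}},
    "size": 0,
    "aggs": {
        "items_agg": nested_agg("items", {"item_names": terms_agg("items.name", size=5)}),
    },
}
```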

Sorting

from elastic_query_builder import QueryBuilder
from elastic_query_builder.core.enums import SortOrder, SortMissing

qb = QueryBuilder()
query = (
    qb.set_match_all()
      .add_sort("date", SortOrder.DESC)
      .add_sort("name", SortOrder.ASC, missing=SortMissing.LAST)
      .add_score_sort()
      .set_size(50)
      .build()
)
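
The sort calls above would be expected to produce a list of sort clauses that Elasticsearch applies in order. The sketch below shows one way such clauses could be assembled. The enum definitions and the field_sort helper are hypothetical re-creations for illustration; the values "asc", "desc", "_first", and "_last" are the ones Elasticsearch accepts, but whether the library's enums use exactly these members is an assumption.

```python
from enum import Enum
from typing import Any, Dict, Optional


class SortOrder(str, Enum):
    """Hypothetical stand-in for the library's SortOrder enum."""
    ASC = "asc"
    DESC = "desc"


class SortMissing(str, Enum):
    """Hypothetical stand-in for the library's SortMissing enum."""
    FIRST = "_first"
    LAST = "_last"


def field_sort(
    field: str,
    order: SortOrder,
    missing: Optional[SortMissing] = None,
) -> Dict[str, Any]:
    """Build one sort clause; 'missing' is emitted only when given."""
    clause: Dict[str, Any] = {"order": order.value}
    if missing is not None:
        clause["missing"] = missing.value
    return {field: clause}


# Clauses are applied in list order: date first, then name, then score.
sort_clauses = [
    field_sort("date", SortOrder.DESC),
    field_sort("name", SortOrder.ASC, missing=SortMissing.LAST),
    {"_score": {"order": "desc"}},
]
```

Later clauses only break ties left by earlier ones, so the score sort here matters only for documents with identical date and name values.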

Source Filtering

qb = QueryBuilder()

# Include specific fields only
query = qb.set_match_all().set_source_includes(["title", "date"]).build()

# Add excludes
query = qb.set_match_all().add_source_excludes("content", "metadata").build()

# Disable _source entirely
query = qb.set_match_all().set_source(False).build()
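
In the request body, these three cases map to different values of the _source key. The following sketch shows the shapes defined by the Elasticsearch DSL; source_clause is a hypothetical helper, not part of the library, and the library's exact output is an assumption.

```python
from typing import Dict, List, Optional, Sequence, Union


def source_clause(
    includes: Optional[Sequence[str]] = None,
    excludes: Optional[Sequence[str]] = None,
    enabled: bool = True,
) -> Union[bool, Dict[str, List[str]]]:
    """Hypothetical helper: return the value to place under '_source'."""
    if not enabled:
        return False  # do not return document source at all
    clause: Dict[str, List[str]] = {}
    if includes:
        clause["includes"] = list(includes)
    if excludes:
        clause["excludes"] = list(excludes)
    # With no filters, True means "return the full source".
    return clause if clause else True


includes_only = {"query": {"match_all": {}}, "_source": source_clause(includes=["title", "date"])}
with_excludes = {"query": {"match_all": {}}, "_source": source_clause(excludes=["content", "metadata"])}
disabled = {"query": {"match_all": {}}, "_source": source_clause(enabled=False)}
```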

Span Queries (Proximity Search)

qb = QueryBuilder()
query = (
    qb.add_must(
        QueryBuilder.SpanNear.build(
            clauses=[
                QueryBuilder.SpanTerm.build("content", "artificial"),
                QueryBuilder.SpanTerm.build("content", "intelligence"),
            ],
            slop=3,
            in_order=True
        )
    )
    .build()
)
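
A span_near query matches documents where the given terms occur within slop positions of each other, optionally in the listed order. The sketch below assembles the corresponding DSL by hand. The span_term and span_near functions are hypothetical helpers, and the shorthand span_term form shown here is one of the forms Elasticsearch accepts; which form the library emits is an assumption.

```python
from typing import Any, Dict, List, Optional


def span_term(field: str, value: str) -> Dict[str, Any]:
    """Hypothetical helper: a single positional term on a field."""
    return {"span_term": {field: value}}


def span_near(
    clauses: List[Dict[str, Any]],
    slop: int,
    in_order: Optional[bool] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: terms within 'slop' positions of each other."""
    body: Dict[str, Any] = {"clauses": list(clauses), "slop": slop}
    if in_order is not None:
        body["in_order"] = in_order
    return {"span_near": body}


# "artificial" followed by "intelligence" with at most 3 positions between them.
proximity = span_near(
    clauses=[
        span_term("content", "artificial"),
        span_term("content", "intelligence"),
    ],
    slop=3,
    in_order=True,
)
```

Span term values are not analyzed, so they generally need to match the indexed tokens (for example, lowercased forms when a lowercasing analyzer is used on the field).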

Real-World Example: Patent Search

from elastic_query_builder import QueryBuilder
from elastic_query_builder.core.enums import SortOrder

qb = QueryBuilder()

# Search conditions (the Korean terms mean "semiconductor" and "light-emitting diode")
qb.add_must(QueryBuilder.Match.build("productKor", "반도체", operator="and"))
qb.add_must(QueryBuilder.Match.build("abstract", "발광 다이오드", boost=2.0))

# Filters
qb.add_filter(QueryBuilder.Range.build("applicationDate", gte="20200101", lte="20241231"))
qb.add_filter(QueryBuilder.Term.build("statusCode", "registered"))

# Exclusions ("테스트" is Korean for "test")
qb.add_must_not(QueryBuilder.Term.build("applicantName", "테스트"))

# Sort + Pagination
qb.add_sort("applicationDate", SortOrder.DESC)
qb.set_size(20)
qb.set_from(0)
qb.set_track_total_hits(True)

# Source filtering
qb.set_source_includes(["applicationNumber", "productKor", "applicantName", "applicationDate"])

query = qb.build()
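
The example above requests the first page by passing 0 to set_from. For later pages, the offset is usually derived from a page number. The helper below is a small sketch of that arithmetic; page_to_from and fits_default_window are hypothetical names, and the 10,000 figure is the Elasticsearch default for index.max_result_window, which a given index may override.

```python
DEFAULT_MAX_RESULT_WINDOW = 10_000  # Elasticsearch default; configurable per index


def page_to_from(page: int, size: int) -> int:
    """Convert a 1-based page number into a 0-based 'from' offset."""
    if page < 1:
        raise ValueError("page must be 1 or greater")
    if size < 0:
        raise ValueError("size must not be negative")
    return (page - 1) * size


def fits_default_window(page: int, size: int) -> bool:
    """Check that from + size stays within the default result window."""
    return page_to_from(page, size) + size <= DEFAULT_MAX_RESULT_WINDOW


offset = page_to_from(page=3, size=20)  # third page of 20 results
```

The resulting offset would be passed as qb.set_from(offset) alongside qb.set_size(20). Requests that exceed the window are rejected by Elasticsearch, so deep pagination typically calls for a different mechanism such as search_after.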

API Reference

For full API documentation, see the docs/ folder in the repository.


Architecture

The library follows a layered architecture with clear domain separation:

elastic_query_builder/
├── core/                  # Enums (SortOrder, SortMissing, BoolClause) and type aliases
├── query/                 # Search Query domain
│   ├── leaf/              #   Term, Match, Range, Wildcard, Exists, IDs, MatchAll, MatchNone
│   ├── compound/          #   BoolQueryBuilder, DisMaxQuery
│   ├── span/              #   SpanTermQuery, SpanNearQuery
│   └── nested.py          #   NestedQuery
├── aggregation/           # Aggregation domain
│   ├── bucket/            #   Terms, DateHistogram, Histogram, Range, Filter, Filters, Nested
│   ├── metric/            #   Sum, Avg, Min, Max, Stats, Cardinality, TopHits
│   └── aggregation_builder.py
├── sort/                  # Sort domain (SortBuilder)
└── builder.py             # QueryBuilder -- top-level integration builder

Design principles:

  1. Each query class is responsible for generating only its own ES query (single responsibility)
  2. build() always returns Dict[str, Any] (consistent interface)
  3. Builder methods return self to support method chaining
  4. None values are excluded from generated queries (clean output)
  5. No external dependencies -- pure Python only
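
These principles can be illustrated with a deliberately small builder. MiniQueryBuilder below is a hypothetical sketch written for this README section, not the library's QueryBuilder; it only shows how returning self enables chaining, how unset values are left out, and how build() ends in a plain dict.

```python
from typing import Any, Dict, List, Optional


class MiniQueryBuilder:
    """Hypothetical, heavily simplified builder for illustration only."""

    def __init__(self) -> None:
        self._must: List[Dict[str, Any]] = []
        self._filter: List[Dict[str, Any]] = []
        self._size: Optional[int] = None

    def add_must(self, clause: Dict[str, Any]) -> "MiniQueryBuilder":
        self._must.append(clause)
        return self  # returning self is what makes chaining possible

    def add_filter(self, clause: Dict[str, Any]) -> "MiniQueryBuilder":
        self._filter.append(clause)
        return self

    def set_size(self, size: int) -> "MiniQueryBuilder":
        self._size = size
        return self

    def build(self) -> Dict[str, Any]:
        # Only non-empty clause lists are emitted.
        bool_body = {
            name: clauses
            for name, clauses in (("must", self._must), ("filter", self._filter))
            if clauses
        }
        query = {"bool": bool_body} if bool_body else {"match_all": {}}
        body: Dict[str, Any] = {"query": query}
        if self._size is not None:  # unset values are omitted
            body["size"] = self._size
        return body


result = (
    MiniQueryBuilder()
    .add_must({"match": {"title": {"query": "elasticsearch"}}})
    .set_size(10)
    .build()
)
```

Because each clause is already a dict by the time it reaches the builder, the builder itself needs no knowledge of individual query types, which is the separation the first principle describes.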

Contributing

Contributions are welcome! Here is how to get started:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/my-feature)
  3. Write tests for your changes
  4. Ensure all tests pass (pytest)
  5. Commit your changes (git commit -m "Add my feature")
  6. Push to your branch (git push origin feature/my-feature)
  7. Open a Pull Request

Please make sure to:

  • Follow the existing code style
  • Add type hints to all public methods
  • Include tests for new functionality
  • Update documentation as needed

License

This project is licensed under the MIT License. See the LICENSE file for details.
