Skip to main content

A lightweight, dependency-free Elasticsearch query builder for Python

Project description

elastic-query-builder

Build Elasticsearch queries you can actually read.

PyPI version Python License: MIT

A lightweight, zero-dependency Python library that brings Java's QueryBuilders pattern to Python. Method chaining, explicit bool clause control, and plain dict output — no magic, no hidden abstractions.


Why This Exists

The raw dict problem

Building ES queries in Python usually starts like this:

query = {"query": {"bool": {"must": [], "filter": []}}}

if keyword:
    query["query"]["bool"]["must"].append(
        {"match": {"title": {"query": keyword, "operator": "and"}}}
    )

if date_from or date_to:
    range_q = {"range": {"date": {}}}
    if date_from:
        range_q["range"]["date"]["gte"] = date_from
    if date_to:
        range_q["range"]["date"]["lte"] = date_to
    query["query"]["bool"]["filter"].append(range_q)

if status:
    query["query"]["bool"]["filter"].append(
        {"term": {"status": status}}
    )

if exclude_test:
    query["query"]["bool"].setdefault("must_not", []).append(
        {"term": {"applicant": "test"}}
    )

query["size"] = 20
query["sort"] = [{"date": {"order": "desc"}}]

When you have 10–20 fields with dynamic conditions, this becomes unreadable fast. Bracket nesting, setdefault calls, manual list appends — hard to review, easy to break.

The elasticsearch-dsl approach

The official elasticsearch-dsl wraps this in its own abstraction. Simple cases look clean:

s = Search().query("match", title="python").filter("term", status="active")

But when you need nested bool queries with mixed clauses, it forces deeply nested constructor calls:

s = Search()
s = s.query(
    Q('bool',
        must=[
            Q('match', title={'query': 'semiconductor', 'operator': 'and'}),
            Q('bool',
                should=[
                    Q('match', content='LED'),
                    Q('match_phrase', abstract='light emitting diode'),
                ],
                minimum_should_match=1
            ),
        ],
        filter=[
            Q('range', application_date={'gte': '20200101', 'lte': '20241231'}),
            Q('term', status_code='registered'),
        ],
        must_not=[Q('term', applicant_name='test')]
    )
)

The Q() wrappers become noise. You still have to mentally parse the nesting to figure out what's in must vs filter vs should. The abstraction hides the actual ES structure instead of making it clearer.

What this library does instead

Think of it like Elasticsearch's own Query DSL — but expressed as Python method chains instead of JSON nesting:

qb = QueryBuilder()

inner = qb.nested_bool()
inner.add_should(QueryBuilder.Match.build("content", "LED"))
inner.add_should(QueryBuilder.MatchPhrase.build("abstract", "light emitting diode"))
inner.add_minimum_should_match(1)

query = (
    qb.add_must(QueryBuilder.Match.build("title", "semiconductor", operator="and"))
      .add_must(inner.build())
      .add_filter(QueryBuilder.Range.build("application_date", gte="20200101", lte="20241231"))
      .add_filter(QueryBuilder.Term.build("status_code", "registered"))
      .add_must_not(QueryBuilder.Term.build("applicant_name", "test"))
      .build()
)

Every line tells you exactly what it does. add_must goes to must. add_filter goes to filter. The structure maps directly to the ES Query DSL you already know.

And dynamic conditions stay clean:

qb = QueryBuilder()

if keyword:
    qb.add_must(QueryBuilder.Match.build("title", keyword, operator="and"))

if date_from or date_to:
    qb.add_filter(QueryBuilder.Range.build("date", gte=date_from, lte=date_to))

if status:
    qb.add_filter(QueryBuilder.Term.build("status", status))

if exclude_test:
    qb.add_must_not(QueryBuilder.Term.build("applicant", "test"))

query = qb.build()

No bracket hell. No setdefault. No manual list management. Just add conditions and build.

This matters for code review. Whether you wrote the query, a teammate wrote it, or an LLM generated it — someone has to verify it does what it's supposed to. Readable structure makes that verification fast.


Installation

pip install elastic-query-builder

Zero dependencies. Works with elasticsearch-py, opensearch-py, or any HTTP client — build() returns a plain dict.


Quick Start

from elastic_query_builder import QueryBuilder

qb = QueryBuilder()
query = (
    qb.add_must(QueryBuilder.Match.build("title", "elasticsearch"))
      .add_filter(QueryBuilder.Range.build("date", gte="2024-01-01"))
      .set_size(10)
      .build()
)

# Use with any ES client
# es.search(index="my-index", body=query)

Real-World Example: Patent Search

from elastic_query_builder import QueryBuilder
from elastic_query_builder.core.enums import SortOrder

qb = QueryBuilder()

# Search conditions
qb.add_must(QueryBuilder.Match.build("productKor", "반도체", operator="and"))
qb.add_must(QueryBuilder.Match.build("abstract", "발광 다이오드", boost=2.0))

# Filters
qb.add_filter(QueryBuilder.Range.build("applicationDate", gte="20200101", lte="20241231"))
qb.add_filter(QueryBuilder.Term.build("statusCode", "registered"))

# Exclusions
qb.add_must_not(QueryBuilder.Term.build("applicantName", "테스트"))

# Sort + Pagination
qb.add_sort("applicationDate", SortOrder.DESC)
qb.set_size(20)
qb.set_from(0)
qb.set_track_total_hits(True)

# Source filtering
qb.set_source_includes(["applicationNumber", "productKor", "applicantName", "applicationDate"])

query = qb.build()

What It Supports

Queries — Bool, Term, Terms, Match, MatchPhrase, Range, Wildcard, Exists, IDs, MatchAll, MatchNone, DisMax, Nested, SpanTerm, SpanNear

Aggregations — Terms, DateHistogram, Histogram, Range, Filter, Filters, Nested (bucket) / Sum, Avg, Min, Max, Stats, Cardinality, TopHits (metric)

Sort — Field sorting, score sorting, script sorting, missing value handling

Source filtering — Includes/excludes control, full disable

Output — Plain dict. No vendor lock-in.

For full API documentation with examples, see the docs/ folder.


Design Principles

  1. Explicit over implicit — method names map directly to ES bool clauses (add_must, add_filter, add_should, add_must_not)
  2. build() returns Dict[str, Any] — predictable, inspectable, serializable
  3. Method chaining via return self — fluent API without sacrificing readability
  4. Zero dependencies — pure Python, no transitive dependency risks
  5. Each query class owns its own output — single responsibility, easy to extend

Contributing

Contributions are welcome. Fork, branch, write tests, open a PR. See docs/ for architecture details.

License

MIT — see LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

elastic_query_builder-0.1.2.tar.gz (78.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

elastic_query_builder-0.1.2-py3-none-any.whl (58.8 kB view details)

Uploaded Python 3

File details

Details for the file elastic_query_builder-0.1.2.tar.gz.

File metadata

  • Download URL: elastic_query_builder-0.1.2.tar.gz
  • Upload date:
  • Size: 78.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for elastic_query_builder-0.1.2.tar.gz
Algorithm Hash digest
SHA256 2407047fe99e04d9992021c7b7b3561c8f8c3bd848006a2f0dfcbf1e97a2c0ed
MD5 a3f1299db2b9fb118b4634b0cd3db4ac
BLAKE2b-256 6b2f94bbe15851beb38974ae31c53ce3f3ff6718a9936fdd5002e365f0ccc7e5

See more details on using hashes here.

Provenance

The following attestation bundles were made for elastic_query_builder-0.1.2.tar.gz:

Publisher: publish.yml on junhyeong9812/python-elastic-query-builder

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file elastic_query_builder-0.1.2-py3-none-any.whl.

File metadata

File hashes

Hashes for elastic_query_builder-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 f116a15e9c28f79c908fdcb508b8ac4aa22012d358fe4ce35aca8bb47064407c
MD5 3ab766906ae99e5ffcdd33e7041596b3
BLAKE2b-256 ea449f5c368eac81c8abee9ed6fe51dcfc97fc96b62818b279e21c4a3482bf2a

See more details on using hashes here.

Provenance

The following attestation bundles were made for elastic_query_builder-0.1.2-py3-none-any.whl:

Publisher: publish.yml on junhyeong9812/python-elastic-query-builder

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page