A Lucene query parser generating ElasticSearch queries and more!

Project description

luqum - A lucene query parser in Python, using PLY

“luqum” (as in LUcene QUery Manipolator) is a tool to parse queries written in the Lucene Query DSL and build an abstract syntax tree to inspect, analyze or otherwise manipulate search queries.

It lets you enrich the meaning of the Lucene Query DSL (for example, to support nested object searches or to treat some fields specially) and transform Lucene DSL queries into the native ElasticSearch JSON DSL.

Thanks to luqum, your users may continue to write queries like: author.last_name:Smith OR author:(age:[25 TO 34] AND first_name:John) and you will be able to leverage the ElasticSearch query DSL and control the precise meaning of each search term.
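
To make the target of that transformation concrete, here is a hand-written sketch of an ElasticSearch query that expresses the same search, assuming author is mapped as a nested field. It is an illustration only, not output captured from luqum, and the exact query luqum emits may differ.

```python
import json

# Hand-written Elasticsearch query DSL for:
#   author.last_name:Smith OR author:(age:[25 TO 34] AND first_name:John)
# Assumption: "author" is a nested field in the index mapping.
es_query = {
    "bool": {
        "should": [
            {
                "nested": {
                    "path": "author",
                    "query": {"match": {"author.last_name": {"query": "Smith"}}},
                }
            },
            {
                "nested": {
                    "path": "author",
                    "query": {
                        "bool": {
                            "must": [
                                {"range": {"author.age": {"gte": 25, "lte": 34}}},
                                {"match": {"author.first_name": {"query": "John"}}},
                            ]
                        }
                    },
                }
            },
        ]
    }
}

print(json.dumps(es_query, indent=2))
```

The Lucene expression stays a one-liner for the user, while the nested and bool wrappers carry the field semantics on the ElasticSearch side.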

Luqum is dual licensed under Apache2.0 and LGPLv3.

Compatible with Python 3.6+ (support for Python 3.4 and 3.5 was dropped in 0.10.0)

Installation

pip install luqum

Dependencies

PLY >= 3.11

Changelog for luqum

The format is based on Keep a Changelog and this project tries to adhere to Semantic Versioning.

0.11.0 - 2021-01-06

Changed

  • completely modified the naming module and auto_name function, as it was not practical as is.

Added

  • added tools to build visual explanations of why a request matches a result (leveraging elasticsearch named queries).
  • added a visitor and a transformer that track the path to each element while visiting the tree.
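
The idea of tracking the path to an element while visiting can be sketched on a toy tree. The Node class and walk_with_path function below are hypothetical stand-ins written for this illustration; they are not luqum's classes.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

Path = Tuple[int, ...]


@dataclass
class Node:
    """Toy tree node (hypothetical, not a luqum class)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def walk_with_path(node: Node, path: Path = ()) -> Iterator[Tuple[Path, Node]]:
    """Yield (path, node) pairs, where path is the tuple of child indexes from the root."""
    yield path, node
    for index, child in enumerate(node.children):
        yield from walk_with_path(child, path + (index,))


tree = Node("OR", [
    Node("last_name:Smith"),
    Node("AND", [Node("age:[25 TO 34]"), Node("first_name:John")]),
])

# Map every path to the label found there.
paths = {path: node.label for path, node in walk_with_path(tree)}
for path, label in paths.items():
    print(path, label)
```

A path such as (1, 0) gives each node a stable address, which is what makes it possible to relate a part of the query back to its position in the tree.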

Fixed

  • fixed the handling of names when transforming luqum tree to elasticsearch queries and added integration tests.

0.10.0 - 2020-09-22

Added

  • support for parsing Regular expressions like /foo/ (no transformation to Elasticsearch DSL yet)
  • basic support for head and tail of expressions (the separators) and for their position (pos and size) in original text
  • added auto_head_tail util (use it if you build your tree programmatically and want a printable representation)
  • tree items now support a clone_item method and a setter for children. This should make transformation patterns easier to write.
  • New visitor.TreeVisitor and visitor.TreeTransformer classes to help in processing trees. utils.LuceneTreeVisitor, utils.LuceneTreeVisitorV2 and utils.LuceneTreeTransformer now emit deprecation warnings (but still work).
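
Why head and tail matter for printing can be shown with a small sketch. The Word and And classes and the add_default_spacing helper are hypothetical toys, not luqum's implementation: each item keeps the separators found around it, so a parsed query prints back unchanged, while a tree built by hand has no separators until some are added.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Word:
    """Toy term that remembers the separators before (head) and after (tail) it."""
    value: str
    head: str = ""
    tail: str = ""

    def __str__(self) -> str:
        return f"{self.head}{self.value}{self.tail}"


@dataclass(frozen=True)
class And:
    """Toy binary AND operation over two words."""
    left: Word
    right: Word

    def __str__(self) -> str:
        return f"{self.left}AND{self.right}"


def add_default_spacing(node: And) -> And:
    """Fill in a single space wherever a separator is missing."""
    left = replace(node.left, tail=node.left.tail or " ")
    right = replace(node.right, head=node.right.head or " ")
    return And(left, right)


# As if parsed from "foo  AND bar": the original spacing is preserved.
parsed = And(Word("foo", tail="  "), Word("bar", head=" "))
# Built by hand: no separators at all.
built = And(Word("foo"), Word("bar"))

print(str(parsed))
print(str(built))
print(str(add_default_spacing(built)))
```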

Changed

  • support for python 3.8 added, support for python 3.4 and 3.5 dropped
  • better printing of Proximity and Fuzzy items (preserve implicit nature of degree)
  • raise IllegalCharacterError on illegal character found instead of printing and skipping
  • renamed ParseError to ParseSyntaxError, and kept ParseError as a parent exception
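
Keeping ParseError as a parent exception means existing handlers that catch it keep working. The sketch below uses stand-in classes and a hypothetical toy_parse function to show the pattern. It assumes IllegalCharacterError also derives from ParseError, which the changelog does not state, and it does not import anything from luqum.

```python
class ParseError(Exception):
    """Stand-in parent exception (toy class, not imported from luqum)."""


class ParseSyntaxError(ParseError):
    """Stand-in for a syntax error in the query."""


class IllegalCharacterError(ParseError):
    """Stand-in for an illegal character; the parent class is an assumption."""


def toy_parse(text: str) -> str:
    """Hypothetical parser: rejects non-printable characters and unbalanced parentheses."""
    for position, char in enumerate(text):
        if not char.isprintable():
            raise IllegalCharacterError(f"illegal character {char!r} at position {position}")
    if text.count("(") != text.count(")"):
        raise ParseSyntaxError("unbalanced parentheses")
    return text


def describe(text: str) -> str:
    """Catch the parent class only and report which concrete error was raised."""
    try:
        toy_parse(text)
    except ParseError as error:
        return type(error).__name__
    return "ok"


for query in ["foo:bar", "foo:(bar", "foo\x07"]:
    print(repr(query), describe(query))
```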

Fixed

  • Range items were not checking the bounds type on equality
  • Boost items were not checking force on equality
  • Reorganize tests

0.9.0 - 2020-07-29

Added

  • support for elasticsearch 7

0.8.1 - 2019-11-01

Added

  • added Apache 2 license, while maintaining LGPLv3+

0.8.0 - 2019-08-02

Added

  • support for multi_match query in ElasticsearchQueryBuilder.

Fixed

  • SchemaAnalyzer now counts non-text fields as not_analyzed
  • ElasticsearchQueryBuilder’s field_options parameter can accept match_type instead of type to change the request type. This is now preferred over type, which may more easily conflict with request parameters.
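
The conflict mentioned above is easy to see with multi_match, whose own type parameter (for example most_fields) is a genuine ElasticSearch request parameter. The build_query function below is a hypothetical sketch of how a separate match_type key avoids the clash; it is not luqum's code, and the option layout is an assumption.

```python
from typing import Any, Dict


def build_query(field: str, text: str, options: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical builder: 'match_type' picks the query kind, other keys pass through."""
    params = dict(options)  # copy so the caller's options are left untouched
    match_type = params.pop("match_type", "match")
    if match_type == "multi_match":
        # multi_match is not keyed by field; 'type' here is a real request parameter.
        return {"multi_match": {"query": text, **params}}
    return {match_type: {field: {"query": text, **params}}}


options = {
    "match_type": "multi_match",   # which kind of query to build
    "type": "most_fields",         # forwarded untouched to Elasticsearch
    "fields": ["title", "body"],
}
print(build_query("title", "lucene", options))
print(build_query("title", "lucene", {"match_type": "match_phrase"}))
```

With a single type key doing both jobs, there would be no way to tell the query kind apart from the multi_match scoring mode.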

0.7.5 - 2018-10-29

Fixed

  • fixed a bug on equality: when one tree had more children than the other, the trees still compared equal if the first nodes were the same!
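
One common way such a bug arises is comparing children pairwise with zip, which silently stops at the shorter list. The sketch below reproduces that failure mode on a toy Node class. It is an assumption about the general pattern, not a copy of the code that was fixed in luqum.

```python
class Node:
    """Toy tree node (hypothetical, not a luqum class)."""

    def __init__(self, value, *children):
        self.value = value
        self.children = list(children)


def buggy_equal(a: Node, b: Node) -> bool:
    """zip() truncates to the shorter list, so extra children are never compared."""
    return a.value == b.value and all(
        buggy_equal(x, y) for x, y in zip(a.children, b.children)
    )


def fixed_equal(a: Node, b: Node) -> bool:
    """Same comparison, with an explicit check on the number of children."""
    return (
        a.value == b.value
        and len(a.children) == len(b.children)
        and all(fixed_equal(x, y) for x, y in zip(a.children, b.children))
    )


short_tree = Node("AND", Node("a"))
long_tree = Node("AND", Node("a"), Node("b"))

print(buggy_equal(short_tree, long_tree))
print(fixed_equal(short_tree, long_tree))
```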

0.7.4 - 2018-08-28

Fixed

  • fixed bug in luqum.utils.LuceneTreeTransformer when removing node
  • fixed bug in handling approx operator on multiple words in luqum.elasticsearch.visitor.ElasticsearchQueryBuilder
  • test coverage now checks branches

0.7.3 - 2018-06-08

Fixed

  • On ElasticSearch query transformation, Luqum was interpreting wildcards in Phrases, whereas it should not

0.7.2 - 2018-05-14

Fixed

  • adding the zero_terms_query to match_phrase was a mistake (introduced in 0.7.0).

Added

  • 0.7.0 introduced the match query for queries with single words on analyzed fields, whereas before it was using match_phrase. Although this is more coherent, it may cause difficulties in edge cases, as it may lead to results different from the previous release.

    This behaviour can be disabled using a new match_word_as_phrase parameter to luqum.elasticsearch.visitor.ElasticsearchQueryBuilder. Note that this parameter may be removed in a future release (field_options might be used instead, on a per-field basis).
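
The decision described here can be sketched as a small function. query_for is a hypothetical helper that only mirrors the rule stated in this changelog; the term query for non-analyzed fields and the "all" value for zero_terms_query are assumptions made for the illustration, not a claim about luqum's exact output.

```python
from typing import Any, Dict


def query_for(field: str, text: str, analyzed: bool = True,
              match_word_as_phrase: bool = False) -> Dict[str, Any]:
    """Hypothetical sketch of choosing between term, match and match_phrase."""
    if not analyzed:
        # Assumption: non-analyzed fields are searched with an exact term query.
        return {"term": {field: {"value": text}}}
    single_word = len(text.split()) == 1
    if single_word and not match_word_as_phrase:
        # Behaviour since 0.7.0: a single word on an analyzed field uses match.
        return {"match": {field: {"query": text, "zero_terms_query": "all"}}}
    # Several words, or the pre-0.7.0 behaviour requested explicitly.
    # No zero_terms_query here, in line with the 0.7.2 fix.
    return {"match_phrase": {field: {"query": text}}}


print(query_for("title", "python"))
print(query_for("title", "python", match_word_as_phrase=True))
print(query_for("title", "python parser"))
print(query_for("status", "published", analyzed=False))
```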

0.7.1 - 2018-03-20

Fixed

  • version introduced because of a bad upload on PyPI (reStructuredText description error)

0.7.0 - 2018-03-20

Added

  • Support for named queries (see elastic named queries)
  • Helper to automatically create ElasticSearch query builder options from the index configuration, see: luqum.elasticsearch.schema
  • a new field_options argument on luqum.elasticsearch.visitor.ElasticsearchQueryBuilder allows adding parameters to field queries. It also lets you control the type of query used for match queries.
  • for a query with a single word on an analyzed field, the transformation to an ElasticSearch query now uses a “match” query instead of a “match_phrase”. This is closer to what the same expression would produce with “query_string”.

Fixed

  • small fix in utils.TreeTransformerV2, which was not removing elements from lists or tuples as stated
  • single word matches are now match, and not match_phrase
  • match_phrase has the zero_terms_query field, as for match

Changed

  • dropped official Python 3.3 support

0.6.0 - 2017-12-12

Added

  • Manage object fields in elasticsearch transformation

Fixed

  • minor fix, getting better error message when parsing error is at the end of content

Changed

  • better handling of nested fields may lead to shorter requests

0.5.3 - 2017-08-21

Added

  • A class to smartly replace implicit operations with explicit ones (OR or AND)
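
Replacing implicit operations can be pictured on a toy tree made of tuples, where "IMPLICIT" marks two terms written side by side without an operator. The resolve_implicit function and the tuple encoding are hypothetical and always resolve to one fixed operator; the class added in this release may choose the operator differently.

```python
from typing import Tuple, Union

Tree = Union[str, Tuple]


def resolve_implicit(node: Tree, default: str = "AND") -> Tree:
    """Return a copy of the toy tree with every 'IMPLICIT' operator replaced by default."""
    if isinstance(node, str):
        return node  # a leaf term, nothing to resolve
    operator, *operands = node
    if operator == "IMPLICIT":
        operator = default
    return (operator, *[resolve_implicit(operand, default) for operand in operands])


# Toy encoding of: foo (bar OR (baz qux))
tree = ("IMPLICIT", "foo", ("OR", "bar", ("IMPLICIT", "baz", "qux")))

print(resolve_implicit(tree))
print(resolve_implicit(tree, default="OR"))
```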

Fixed

  • handling of field names with numbers followed by a number (better handling of time in expressions)

Changed

  • now using ply 3.10

0.5.2 - 2017-05-29

Changed

  • better recursion in the tree transformer util (API Change)

Fixed

  • handling of empty phrases for elasticsearch query builder

0.5.1 - 2017-04-10

a minor release

Changed

  • Better handling of the implicit operator on printing

0.5.0 - 2017-04-04

Changed

  • Operations now support multiple operands (instead of only two). This mitigates the construction of very deep trees.
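
The effect on tree depth can be checked on a toy representation where an operation is a tuple of an operator followed by its operands. The depth and flatten functions are hypothetical helpers for this comparison only: a chain of four terms needs three nested binary operations, but a single operation with four operands.

```python
from typing import Tuple, Union

Tree = Union[str, Tuple]


def depth(node: Tree) -> int:
    """Number of operation levels above the deepest leaf."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(operand) for operand in node[1:])


def flatten(node: Tree) -> Tree:
    """Merge nested operations that use the same operator into one n-ary operation."""
    if isinstance(node, str):
        return node
    operator, *operands = node
    flat = []
    for operand in map(flatten, operands):
        if not isinstance(operand, str) and operand[0] == operator:
            flat.extend(operand[1:])  # absorb the operands of the nested same-operator node
        else:
            flat.append(operand)
    return (operator, *flat)


# a AND b AND c AND d, built from binary operations only
binary = ("AND", ("AND", ("AND", "a", "b"), "c"), "d")
n_ary = flatten(binary)

print(depth(binary), binary)
print(depth(n_ary), n_ary)
```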

Fixed

  • fixes and improvement of documentation

0.4.0 - 2016-12-05

Changed

  • The Lucene query checker now checks nested fields before transformation to prevent bad usage

0.3.1 - 2016-11-23

Added

  • Support for nested fields in Elastic Search queries

Changed

  • improved performance by adding a cache to the tree visitor utility

0.3 - 2016-11-21

(Note that 0.2 version was skipped)

Added

  • Transforming Lucene queries to Elastic Search queries
  • Added a new tree visitor, TreeVisitorV2, which is easier to use

Fixed

  • Improved first tree visitor utility and its tests (API Change)

0.1 - 2016-05-17

This was the initial release of Luqum.

Added

  • the parser and the tree structure
  • the visitor and transformer utils
  • the Lucene query consistency checker
  • the prettify for pretty printing

Files for luqum, version 0.11.0

  • luqum-0.11.0.linux-x86_64.tar.gz (82.6 kB), file type: Source, Python version: None
  • luqum-0.11.0-py3-none-any.whl (49.9 kB), file type: Wheel, Python version: py3