A tiny tool for checking the search quality of Elasticsearch indices.

Overview

Esqa automates checks on the search quality of Elasticsearch indices, in the style of unit test frameworks such as RSpec or pytest. Users add test cases to a settings file and run the esqa command to check whether the target indices are built as expected.

Install

$ pip install esqa

Behavior

When we run Esqa, the following steps are executed.

  1. Submit an Es query to an Elasticsearch cluster
  2. Get the result ranking from Elasticsearch
  3. Check whether the ranking from the Es cluster satisfies the conditions described in the configuration file

[Figure: Esqa overview]
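The three steps above can be sketched in a few lines of Python. This is only an illustration, not Esqa's implementation: run_case and fake_search are hypothetical names, the cluster is replaced by a canned response shaped like an Elasticsearch search result, and the sketch assumes that fields are read from each hit's _source and that only the equal condition is handled.

```python
from typing import Callable, Dict, List


def run_case(case: Dict, search: Callable[[Dict], Dict]) -> List[bool]:
    """Run one test case and return one pass/fail flag per condition."""
    # Steps 1 and 2: submit the query and collect the ranked documents.
    response = search(case["request"])
    ranking = [hit["_source"] for hit in response["hits"]["hits"]]

    # Step 3: check each condition against the ranking ("equal" only here).
    results = []
    for cond in case["asserts"]:
        if cond["type"] != "equal":
            raise ValueError(f"unsupported condition type: {cond['type']}")
        rank, item = cond["rank"], cond["item"]
        ok = rank < len(ranking) and ranking[rank].get(item["field"]) == item["value"]
        results.append(ok)
    return results


def fake_search(request: Dict) -> Dict:
    """Stand-in for an Elasticsearch cluster (assumption: canned data)."""
    return {"hits": {"hits": [
        {"_source": {"document_id": "24343", "message": "software engineer"}},
        {"_source": {"document_id": "100", "message": "engineer wanted"}},
    ]}}


case = {
    "name": "match query",
    "request": {"query": {"match": {"message": {"query": "engineer"}}}},
    "asserts": [{"type": "equal", "rank": 0,
                 "item": {"field": "document_id", "value": "24343"}}],
}
print(run_case(case, fake_search))
```

Passing the search function in as a parameter keeps the sketch runnable without a cluster; a real run would replace fake_search with a call to Elasticsearch.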

Functions

Specifically, esqa provides two functions: assertion, and computing the distance between the rankings obtained from two index and query settings.

With the assertion function, we can check whether the result rankings for the specified queries meet expectations. With the distance function, we can find the queries whose rankings differ substantially from those under the previous settings (index and query).

The following sections describe the assertion and distance functions.

Assertion function

Esqa provides the esqa command, which checks whether queries get the expected search rankings from Elasticsearch indices.

We run the esqa command, specifying the configuration file and the target index.

$ esqa assertion --config sample_config.json --index document-index

Configurations

Esqa has a settings file to which we add the test cases. The following is an example. It states that, when we run the defined query (searching for engineer in the message field) against the target index, the results from the Elasticsearch cluster must satisfy the conditions defined in the asserts block.

{
  "cases": [
    {
      "name": "match query",
      "request": {
        "query": {
          "match": {
            "message": {
              "query": "engineer"
            }
          }
        }
      },
      "asserts": [
        {
          "type": "equal",
          "rank": 0,
          "item": {
            "field": "document_id",
            "value": "24343"
          }
        }
      ]
    }
  ]
}

We add all the test cases to the cases block. Each test case has three elements: name, request and asserts. name is the name of the test case. request is the Es query that we want to validate. asserts holds a set of expected behaviors.
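As a rough illustration of this structure, the following sketch checks that every case in an assertion configuration carries the three elements. find_problems is a hypothetical helper written for this example; it is not part of Esqa, and Esqa may validate its settings differently.

```python
from typing import Dict, List

REQUIRED_CASE_KEYS = ("name", "request", "asserts")


def find_problems(config: Dict) -> List[str]:
    """Return human-readable problems found in an assertion configuration."""
    cases = config.get("cases")
    if not isinstance(cases, list):
        return ["top-level 'cases' must be a list"]

    problems = []
    for index, case in enumerate(cases):
        for key in REQUIRED_CASE_KEYS:
            if key not in case:
                problems.append(f"case {index}: missing '{key}'")
    return problems


config = {
    "cases": [
        {"name": "match query", "request": {"query": {}}, "asserts": []},
        {"name": "no asserts yet", "request": {"query": {}}},
    ]
}
for problem in find_problems(config):
    print(problem)
```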

The asserts block contains the conditions that search results from the Elasticsearch cluster must satisfy. Each condition has three elements: type, rank and item.

Element  Summary
type     condition type (equal, higher, lower)
rank     rank of the specified item
item     item stored in the Elasticsearch indices that the result at the rank given in the rank element must match

The item element specifies a document in the Es indices. A document is identified by a field and its value.

Element  Summary
field    field name
value    value of the field specified in the field element
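The sketch below shows one way the three condition types could be evaluated against a ranking. The semantics are assumptions, since the text above does not spell them out: ranks are taken to be zero-based (the sample uses rank 0 for the top hit), equal means the item sits exactly at the given rank, higher means it appears nearer the top than that rank, and lower means it appears further down. find_rank and check_condition are hypothetical names.

```python
from typing import Dict, List, Optional


def find_rank(ranking: List[Dict], item: Dict) -> Optional[int]:
    """Return the zero-based position of the first document matching item."""
    for position, doc in enumerate(ranking):
        if doc.get(item["field"]) == item["value"]:
            return position
    return None


def check_condition(ranking: List[Dict], cond: Dict) -> bool:
    """Evaluate one condition; a missing item always fails (assumption)."""
    actual = find_rank(ranking, cond["item"])
    if actual is None:
        return False
    if cond["type"] == "equal":
        return actual == cond["rank"]
    if cond["type"] == "higher":   # assumed: nearer the top than `rank`
        return actual < cond["rank"]
    if cond["type"] == "lower":    # assumed: further down than `rank`
        return actual > cond["rank"]
    raise ValueError(f"unknown condition type: {cond['type']}")


ranking = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
cond = {"type": "higher", "rank": 1, "item": {"field": "id", "value": "a"}}
print(check_condition(ranking, cond))
```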

Templates

Sometimes the queries in the test cases are almost identical. For such cases, esqa supports templates in the configuration files.

A template file is a JSON file that contains an Elasticsearch query with variables.

The following is an example of a template file. As we can see, the query block contains a variable ${query_str}. The values of the variables are injected from the Esqa configuration file.

{
  "query": {
    "match": {
      "message": {
        "query": "${query_str}"
      }
    }
  }
}

The following is a configuration file that uses the template file. To use a template, we add a template element to the request block. The variables in the specified template file must also be set in the request block. In this example, the configuration file sets the variable query_str defined in the template file.

{
  "templates": [
    {
      "name": "basic_query",
      "path": "tests/fixtures/default_template.json"
    }
  ],
  "cases": [
    {
      "name": "match identical",
      "request": {
        "template": "basic_query",
        "query_str": "engineer"
      },
      "asserts": [
        {
          "type": "equal",
          "rank": 0,
          "item": {
            "field": "id",
            "value": "2324"
          }
        }
      ]
    }
  ]
}
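To make the variable injection concrete, here is a minimal sketch using Python's string.Template, which happens to use the same ${name} placeholder syntax. It is an assumption that Esqa substitutes variables this way. render_request is a hypothetical helper, and the template is inlined as a string instead of being read from the path given in the templates block.

```python
import json
from string import Template

TEMPLATE_TEXT = """
{
  "query": {
    "match": {
      "message": {
        "query": "${query_str}"
      }
    }
  }
}
"""


def render_request(template_text: str, request: dict) -> dict:
    """Fill the template with every request entry except 'template'."""
    variables = {
        # JSON-escape the value, then drop the surrounding quotes because
        # the template already wraps the placeholder in quotes.
        key: json.dumps(str(value))[1:-1]
        for key, value in request.items()
        if key != "template"
    }
    return json.loads(Template(template_text).substitute(variables))


request = {"template": "basic_query", "query_str": "engineer"}
print(json.dumps(render_request(TEMPLATE_TEXT, request), indent=2))
```

Template.substitute raises KeyError when a variable used in the template is missing from the request, which surfaces incomplete test cases early.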

Distance function

When we tune the Es indices, we sometimes want to compare the rankings with those from the previous indices. Esqa compares the rankings under the current settings with the previous ones.

Before running the command, we prepare the configuration for the esqa distance function. The format is almost the same as the assertion settings, except that the settings for the distance function do not have asserts blocks.

{
  "templates": [{
    "name": "basic_query",
    "path": "sample/template.json"
  }],
  "cases": [
    {"request": {"template": "basic_query", "query_str":  "Windows PC"}, "name": "Windows PC"},
    {"request": {"template": "basic_query", "query_str": "Tablet"}, "name": "Tablet"}
  ]
}
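Because the two formats differ only in the asserts blocks, a distance configuration can be derived from an existing assertion configuration. The sketch below shows the idea; to_distance_config is a hypothetical helper, not an Esqa function.

```python
import copy
import json


def to_distance_config(assertion_config: dict) -> dict:
    """Copy an assertion configuration and drop every asserts block."""
    result = copy.deepcopy(assertion_config)  # leave the input untouched
    for case in result.get("cases", []):
        case.pop("asserts", None)
    return result


assertion_config = {
    "templates": [{"name": "basic_query", "path": "sample/template.json"}],
    "cases": [
        {
            "name": "Tablet",
            "request": {"template": "basic_query", "query_str": "Tablet"},
            "asserts": [{"type": "equal", "rank": 0,
                         "item": {"field": "id", "value": "22"}}],
        }
    ],
}
print(json.dumps(to_distance_config(assertion_config), indent=2))
```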

Before changing the Es settings, we run the save command to preserve the current ranking.

$ esqa save --config sample/ranking.json --index sample > output/ranking_before_change.json

Then we change the Es index or query settings and run the distance command, specifying the saved ranking file.

$ esqa distance --config sample/compared_ranking.json --index sample --ranking output/ranking.json
[
  {
    "name": "Windows PC",
    "similarity": 0.5,
    "ranking_pair": [
      [
        "4",
        "6"
      ],
      [
        "5",
        "4"
      ],
      [
        "6",
        "5"
      ]
    ]
  },
  {
    "name": "Tablet",
    "similarity": 0.5416666666666666,
    "ranking_pair": [
      [
        "22",
        "21"
      ],
      [
        "23",
        "22"
      ],
      [
        "3",
        "23"
      ],
      [
        "21",
        "3"
      ]
    ]
  }
]

Alternatively, we can compare two saved rankings with the distance-rankings command.

$ esqa distance-rankings --ranking1 output/ranking1.json --ranking2 output/ranking2.json
[
  {
    "name": "Windows PC",
    "similarity": 0.5,
    "ranking_pair": [
      [
        "4",
        "6"
      ],
      [
        "5",
        "4"
      ],
      [
        "6",
        "5"
      ]
    ]
  },
  {
    "name": "Tablet",
    "similarity": 0.5416666666666666,
    "ranking_pair": [
      [
        "22",
        "21"
      ],
      [
        "23",
        "22"
      ],
      [
        "3",
        "23"
      ],
      [
        "21",
        "3"
      ]
    ]
  }
]

Finally, we get the query cases whose rankings have changed significantly.
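The text does not name the metric behind the similarity field. Average overlap (the mean, over each depth k, of the fraction of documents shared by the two top-k lists) happens to reproduce both values in the sample output, so the sketch below uses it. Treat this as a guess, not as a description of Esqa's implementation. It also assumes that each entry of ranking_pair holds the documents found at the same rank in the two rankings.

```python
from typing import List, Sequence


def average_overlap(before: Sequence[str], after: Sequence[str]) -> float:
    """Mean of |top-k(before) & top-k(after)| / k for k = 1..depth."""
    depth = min(len(before), len(after))
    if depth == 0:
        return 0.0
    total = 0.0
    for k in range(1, depth + 1):
        shared = len(set(before[:k]) & set(after[:k]))
        total += shared / k
    return total / depth


def split_pairs(ranking_pair: List[List[str]]):
    """Turn [[a1, b1], [a2, b2], ...] into ([a1, a2, ...], [b1, b2, ...])."""
    return [p[0] for p in ranking_pair], [p[1] for p in ranking_pair]


windows_pc = [["4", "6"], ["5", "4"], ["6", "5"]]
tablet = [["22", "21"], ["23", "22"], ["3", "23"], ["21", "3"]]
for name, pairs in (("Windows PC", windows_pc), ("Tablet", tablet)):
    before, after = split_pairs(pairs)
    print(name, average_overlap(before, after))
```

A value of 1.0 means the two rankings agree at every depth, so cases with low values are the ones worth inspecting first.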
