pySigma Elasticsearch backend supporting Lucene, ES|QL (with correlations) and EQL queries

Project description

pySigma Elasticsearch Backend

This is the Elasticsearch backend for pySigma. It provides the package sigma.backends.elasticsearch with the LuceneBackend class.

It supports the following output formats:

default: Lucene queries.
dsl_lucene: DSL with embedded Lucene queries.
eql: Elastic Event Query Language queries.
kibana_ndjson: Kibana NDJSON with Lucene queries.

Further, it contains the following processing pipelines in sigma.pipelines.elasticsearch:

ecs_windows in windows submodule: ECS mapping for Windows event logs ingested with Winlogbeat.
ecs_windows_old in windows submodule: ECS mapping for Windows event logs ingested with Winlogbeat <= 6.x.
ecs_zeek_beats in zeek submodule: Zeek ECS mapping from Elastic.
ecs_zeek_corelight in zeek submodule: Zeek ECS mapping from Corelight.
zeek_raw in zeek submodule: Zeek raw JSON log field naming.
ecs_kubernetes in kubernetes submodule: ECS mapping for Kubernetes audit logs ingested with Kubernetes integration.
ecs_macos_esf in macos submodule: ECS mapping for macOS Endpoint Security Framework (ESF) events.

This backend is currently maintained by:

Further maintainers required! Send a message to Thomas if you want to co-maintain this backend.

Formats vs. Query Post Processing

The built-in formats aim to provide the minimum compatible output, so they can't fit everyone's needs. This gap is filled by a feature called "query post processing", available since pySigma v0.10.

For further information please read "Introducing Query Post-Processing and Output Finalization to Processing Pipelines".

Lucene Kibana NDJSON

Instead of using the built-in format with -t lucene -f kibana_ndjson, you can use the following query postprocessing pipeline to get the same output, or use it as a starting point for your own customizations.

# lucene-kibana-ndjson.yml
postprocessing:
- type: template
  template: |+
    {"id": "{{ rule.id }}", "type": "search", "attributes": {"title": "SIGMA - {{ rule.title }}", "description": "{{ rule.description }}", "hits": 0, "columns": [], "sort": ["@timestamp", "desc"], "version": 1, "kibanaSavedObjectMeta": {"searchSourceJSON": "{\"index\": \"beats-*\", \"filter\": [], \"highlight\": {\"pre_tags\": [\"@kibana-highlighted-field@\"], \"post_tags\": [\"@/kibana-highlighted-field@\"], \"fields\": {\"*\": {}}, \"require_field_match\": false, \"fragment_size\": 2147483647}, \"query\": {\"query_string\": {\"query\": \"{{ query }}\", \"analyze_wildcard\": true}}}"}}, "references": [{"id": "beats-*", "name": "kibanaSavedObjectMeta.searchSourceJSON.index", "type": "index-pattern"}]}

Use this pipeline with: -t lucene -p lucene-kibana-ndjson.yml but now without -f kibana_ndjson.
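The template above nests one JSON document inside another: searchSourceJSON is a JSON string that itself contains JSON, which is why its inner quotes are escaped. The following is a minimal Python sketch of that structure, not the backend's implementation. The function name kibana_saved_search and the sample argument values are hypothetical; the field names and constants are copied from the template.

```python
import json


def kibana_saved_search(rule_id, title, description, query, index="beats-*"):
    """Build one NDJSON line shaped like the output of the template above.

    Hypothetical helper for illustration only.
    """
    search_source = {
        "index": index,
        "filter": [],
        "highlight": {
            "pre_tags": ["@kibana-highlighted-field@"],
            "post_tags": ["@/kibana-highlighted-field@"],
            "fields": {"*": {}},
            "require_field_match": False,
            "fragment_size": 2147483647,
        },
        "query": {"query_string": {"query": query, "analyze_wildcard": True}},
    }
    saved_object = {
        "id": rule_id,
        "type": "search",
        "attributes": {
            "title": f"SIGMA - {title}",
            "description": description,
            "hits": 0,
            "columns": [],
            "sort": ["@timestamp", "desc"],
            "version": 1,
            "kibanaSavedObjectMeta": {
                # The search source is stored as a JSON *string*, so it is
                # serialized once here and again with the outer object.
                "searchSourceJSON": json.dumps(search_source),
            },
        },
        "references": [
            {
                "id": index,
                "name": "kibanaSavedObjectMeta.searchSourceJSON.index",
                "type": "index-pattern",
            }
        ],
    }
    # json.dumps emits a single line by default, as NDJSON requires.
    return json.dumps(saved_object)


# Sample values are made up for the example.
line = kibana_saved_search(
    "00000000-0000-0000-0000-000000000000",
    "Example Rule",
    "Demo description",
    'process.name:"cmd.exe"',
)
```

Decoding the line twice (the outer object first, then searchSourceJSON) recovers the original query string, which is a quick way to check that a customized template still produces valid nested JSON.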

Lucene Kibana SIEM Rule

Instead of using the built-in format with -t lucene -f siem_rule, you can use the following query postprocessing pipeline to get the same output, or use it as a starting point for your own customizations.

# lucene-kibana-siemrule.yml
vars:
  index_names: 
    - "apm-*-transaction*"
    - "auditbeat-*"
    - "endgame-*"
    - "filebeat-*"
    - "logs-*"
    - "packetbeat-*"
    - "traces-apm*"
    - "winlogbeat-*"
    - "-*elastic-cloud-logs-*"
  schedule_interval: 5
  schedule_interval_unit: m
postprocessing:
- type: template
  template: |+
    {
      "name": "SIGMA - {{ rule.title }}",
      "consumer": "siem",
      "enabled": true,
      "throttle": null,
      "schedule": {
        "interval": "{{ pipeline.vars.schedule_interval }}{{ pipeline.vars.schedule_interval_unit }}"
      },
      "params": {
        "author": [
        {% if rule.author is string -%}
          "{{rule.author}}"
        {% else %}
        {% for a in rule.author -%}
          "{{ a }}"{% if not loop.last %},{%endif%}
        {% endfor -%}
        {% endif -%}
        ],
        "description": "{{ rule.description }}",
        "ruleId": "{{ rule.id }}",
        "falsePositives": {{ rule.falsepositives | tojson }},
        "from": "now-{{ pipeline.vars.schedule_interval }}{{ pipeline.vars.schedule_interval_unit }}",
        "immutable": false,
        "license": "DRL",
        "outputIndex": "",
        "meta": {
          "from": "1m"
        },
        "maxSignals": 100,
        "riskScore": {{ backend.severity_risk_mapping[rule.level.name] if rule.level is not none else 21 }},
        "riskScoreMapping": [],
        "severity": "{{ rule.level.name | string | lower if rule.level is not none else 'low' }}",
        "severityMapping": [],
        "threat": {{ backend.finalize_output_threat_model(rule.tags) | list | tojson }},
        "to": "now",
        "references": {{ rule.references |tojson(indent=6)}},
        "version": 1,
        "exceptionsList": [],
        "relatedIntegrations": [],
        "requiredFields": [],
        "setup": "",
        "type": "query",
        "language": "lucene",
        "index": {{ pipeline.vars.index_names | tojson(indent=6)}},
        "query": "{{ query }}",
        "filters": []
      },
      "rule_type_id": "siem.queryRule",
      "tags": [
        {% for n in rule.tags -%}
        "{{ n.namespace }}-{{ n.name }}"{% if not loop.last %},{%endif%}
      {% endfor -%}
      ],
      "notify_when": "onActiveAlert",
      "actions": []
    }

Use this pipeline with: -t lucene -p lucene-kibana-siemrule.yml but now without -f siem_rule.
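Several fields in the SIEM rule template are derived rather than copied: the schedule interval and the "from" offset are built from the two pipeline variables, a single author string is wrapped in a list, tags are joined as namespace-name, and risk score and severity fall back to 21 and "low" when the rule has no level. The following Python sketch restates that logic outside of Jinja. It is illustrative only: derive_rule_fields is a hypothetical helper, the result is a flat dictionary rather than the nested rule document, the level is assumed to arrive as an upper-case name, and the values in SEVERITY_RISK_MAPPING are assumed example values (only the fallback of 21 comes from the template).

```python
# Assumed example mapping; the real values live in the backend's
# severity_risk_mapping attribute and may differ.
SEVERITY_RISK_MAPPING = {"LOW": 21, "MEDIUM": 47, "HIGH": 73, "CRITICAL": 99}


def derive_rule_fields(author, level, tags, interval=5, unit="m"):
    """Compute the derived fields of the SIEM rule template (hypothetical helper).

    author: a string, a list of strings, or None
    level:  an upper-case level name such as "HIGH", or None
    tags:   a list of (namespace, name) pairs
    """
    # A single author string becomes a one-element list.
    if isinstance(author, str):
        authors = [author]
    else:
        authors = list(author or [])

    schedule = f"{interval}{unit}"
    return {
        "author": authors,
        "interval": schedule,
        # Look back exactly one schedule interval.
        "from": f"now-{schedule}",
        # Rules without a level fall back to risk score 21 and severity "low".
        "riskScore": SEVERITY_RISK_MAPPING[level] if level is not None else 21,
        "severity": level.lower() if level is not None else "low",
        "tags": [f"{namespace}-{name}" for namespace, name in tags],
    }


# Sample inputs are made up for the example.
no_level = derive_rule_fields("Jane Doe", None, [("attack", "t1059")])
high = derive_rule_fields(["A", "B"], "HIGH", [], interval=10, unit="m")
```

Changing schedule_interval in the pipeline vars therefore changes both how often the rule runs and how far back each run looks.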

EQL siem_rule_ndjson

# eql-siemrule-ndjson.yml
vars:
  schedule_interval: 5
  schedule_interval_unit: m
postprocessing:
  - type: template
    template: |+
      {%- set tags = [] -%}
      {% for n in rule.tags %}
        {%- set tag_string = n.namespace ~ '-' ~ n.name -%}
        {%- set tags=tags.append(tag_string) -%}
      {% endfor %}

      {%- set rule_data = {
        "name": rule.title,
        "id": rule.id | lower,
        "author": [rule.author] if rule.author is string else rule.author or "",
        "description": rule.description if rule.description else "empty description",
        "references": rule.references,
        "enabled": true,
        "interval": pipeline.vars.schedule_interval|string ~ pipeline.vars.schedule_interval_unit,
        "from": "now-" ~ pipeline.vars.schedule_interval|string ~ pipeline.vars.schedule_interval_unit,
        "rule_id": rule.id | lower,
        "false_positives": rule.falsepositives,
        "immutable": false,
        "output_index": "",
        "meta": {
          "from": "1m"
        },
        "risk_score": rule.custom_attributes.risk_score | default(21),
        "severity": rule.level.name | string | lower if rule.level is not none else 'low',
        "threat": rule.custom_attributes.threat | default([]),
        "severity_mapping": [],
        "to": "now",
        "version": 1,
        "max_signals": 100,
        "exceptions_list": [],
        "setup": "",
        "type": "eql",
        "note": "",
        "license": "DRL",
        "language": "eql",
        "query": query,
        "tags": tags,
        "index": pipeline.state.index,
        "actions": [],
        "related_integrations": [],
        "required_fields": [],
        "risk_score_mapping": []
      }
      -%}
      
      {{ rule_data | tojson }}

Use this pipeline with: -t eql -p eql-siemrule-ndjson.yml but now without -f siem_rule_ndjson. The output can be imported directly into Kibana as a Detection Rule.

ESQL siem_rule_ndjson

# esql-siemrule-ndjson.yml
vars:
  schedule_interval: 5
  schedule_interval_unit: m
postprocessing:
  - type: template
    template: |+
      {%- set tags = [] -%}
      {% for n in rule.tags %}
        {%- set tag_string = n.namespace ~ '-' ~ n.name -%}
        {%- set tags=tags.append(tag_string) -%}
      {% endfor %}
      {%- set rule_data = {
        "name": rule.title,
        "id": rule.id | lower,
        "author": [rule.author] if rule.author is string else rule.author,
        "description": rule.description,
        "references": rule.references,
        "enabled": true,
        "interval": pipeline.vars.schedule_interval|string ~ pipeline.vars.schedule_interval_unit,
        "from": "now-" ~ pipeline.vars.schedule_interval|string ~ pipeline.vars.schedule_interval_unit,
        "rule_id": rule.id | lower,
        "false_positives": rule.falsepositives,
        "immutable": false,
        "output_index": "",
        "meta": {
          "from": "1m"
        },
        "risk_score": backend.severity_risk_mapping[rule.level.name] if rule.level is not none else 21,
        "severity": rule.level.name | string | lower if rule.level is not none else "low",
        "severity_mapping": [],
        "threat": backend.finalize_output_threat_model(rule.tags) | list,
        "to": "now",
        "version": 1,
        "max_signals": 100,
        "exceptions_list": [],
        "setup": "",
        "type": "esql",
        "note": "",
        "license": "DRL",
        "language": "esql",
        "index": pipeline.vars.index_names | list,
        "query": query,
        "tags": tags,
        "actions": [],
        "related_integrations": [],
        "required_fields": [],
        "risk_score_mapping": []
      }
      -%}
      
      {{ rule_data | tojson }}

Use this pipeline with: -t esql -p esql-siemrule-ndjson.yml but now without -f siem_rule_ndjson. The output can be imported directly into Kibana as a Detection Rule.

Lucene siem_rule_ndjson

To be continued...

Known Limitations

ES|QL Correlation Rules: Static Time Boundaries

ES|QL correlation rules (event count, value count, and temporal correlation types) use DATE_TRUNC() to assign events to fixed, epoch-aligned time buckets. This means that the timespan window in a correlation rule is aligned to clock boundaries rather than being a true sliding window.

Consequence: A correlation rule with a timespan of 5m only matches if all relevant events appear within the same clock-aligned 5-minute interval (e.g. 17:15:00–17:20:00 or 17:20:00–17:25:00, where the end is exclusive). If events straddle a boundary — some falling in one bucket and some in the next — neither bucket may independently meet the detection threshold, resulting in a false negative.

Example: Five SSH authentication failures spanning the 17:19–17:20 boundary would be split into two buckets (4 events + 1 event). A rule requiring 5 distinct usernames would not fire, even though all 5 events occurred within a 5-minute window.

This behavior is described in the Sigma correlation rules specification and is an intentional trade-off, as ES|QL does not natively support true sliding-window aggregations. Users authoring correlation rules should be aware that detections may be missed when attack activity straddles a clock-aligned bucket boundary.
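The difference between clock-aligned buckets and a sliding window can be reproduced with a few lines of Python. This is a sketch of the bucketing arithmetic only, not of the generated ES|QL: bucket_start and sliding_max are hypothetical helpers, the timestamps are made up to match the example above, and for simplicity the sketch counts events rather than distinct usernames.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts, span):
    """Return the start of the epoch-aligned bucket of width `span` containing ts."""
    return EPOCH + ((ts - EPOCH) // span) * span


def sliding_max(timestamps, span):
    """Largest number of events inside any half-open window [t, t + span)."""
    ordered = sorted(timestamps)
    best = 0
    left = 0
    for right, ts in enumerate(ordered):
        # Shrink the window until its oldest event is less than `span` away.
        while ts - ordered[left] >= span:
            left += 1
        best = max(best, right - left + 1)
    return best


span = timedelta(minutes=5)
day = datetime(2024, 1, 1, tzinfo=timezone.utc)

# Four failures shortly before 17:20:00 and one shortly after it.
events = [day.replace(hour=17, minute=19, second=s) for s in (10, 20, 30, 40)]
events.append(day.replace(hour=17, minute=20, second=5))

per_bucket = Counter(bucket_start(ts, span) for ts in events)
bucketed_max = max(per_bucket.values())   # best count in any fixed bucket
windowed_max = sliding_max(events, span)  # best count in any sliding window
```

With a threshold of 5, the fixed buckets (4 events in 17:15–17:20 and 1 event in 17:20–17:25) never reach it, while a true sliding window would see all 5 events within 55 seconds.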

Project details

Release history

2.0.3 (this version): May 20, 2026
2.0.2: Jan 22, 2026
2.0.1: Jan 8, 2026
2.0.0: Dec 1, 2025
1.2.0rc1 (pre-release): Aug 18, 2025
1.1.6: May 20, 2025
1.1.5: Nov 19, 2024
1.1.4: Nov 15, 2024
1.1.3: Nov 3, 2024
1.1.2: Aug 26, 2024
1.1.1: Jun 20, 2024
1.1.0: Apr 22, 2024
1.0.12: Jan 31, 2024
1.0.10: Jan 11, 2024
1.0.9: Oct 11, 2023
1.0.8: Oct 8, 2023
1.0.7: Sep 2, 2023
1.0.6: Aug 30, 2023
1.0.5: Jul 3, 2023
1.0.4: Jun 27, 2023
1.0.3: Apr 20, 2023
1.0.2: Apr 19, 2023
1.0.1: Apr 15, 2023
1.0.0: Apr 14, 2023
0.2.0: Jan 18, 2023
0.1.2: Jan 7, 2023
0.1.1: Aug 16, 2022
0.1.0: Jul 28, 2022

Download files

Source Distribution

pysigma_backend_elasticsearch-2.0.3.tar.gz (34.1 kB)

Uploaded May 20, 2026 Source

Built Distribution

pysigma_backend_elasticsearch-2.0.3-py3-none-any.whl (41.2 kB)

Uploaded May 20, 2026 Python 3

File details

Details for the file pysigma_backend_elasticsearch-2.0.3.tar.gz.

File metadata

Download URL: pysigma_backend_elasticsearch-2.0.3.tar.gz
Upload date: May 20, 2026
Size: 34.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pysigma_backend_elasticsearch-2.0.3.tar.gz
Algorithm	Hash digest
SHA256	`03c35fc78929557289fc611c1a3ce29ad46045ef4a8aaa216599e67e775ae21c`
MD5	`0ac16284d4dacd34b9bb0872a29f851a`
BLAKE2b-256	`f58554a2c6c2b041bc187e256e65e6c2773d07def321cf0c1a4a57d60b814a02`

Provenance

The following attestation bundles were made for pysigma_backend_elasticsearch-2.0.3.tar.gz:

Publisher: release.yml on SigmaHQ/pySigma-backend-elasticsearch

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pysigma_backend_elasticsearch-2.0.3.tar.gz
- Subject digest: 03c35fc78929557289fc611c1a3ce29ad46045ef4a8aaa216599e67e775ae21c
- Sigstore transparency entry: 1579758314
- Sigstore integration time: May 20, 2026
Source repository:
- Permalink: SigmaHQ/pySigma-backend-elasticsearch@c8729cc23d426499356ca56ea8139dc323111736
- Branch / Tag: refs/tags/v2.0.3
- Owner: https://github.com/SigmaHQ
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@c8729cc23d426499356ca56ea8139dc323111736
- Trigger Event: release

File details

Details for the file pysigma_backend_elasticsearch-2.0.3-py3-none-any.whl.

File metadata

Download URL: pysigma_backend_elasticsearch-2.0.3-py3-none-any.whl
Upload date: May 20, 2026
Size: 41.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pysigma_backend_elasticsearch-2.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`131ae1532260c9604cd96616c96ad4b4c642e5e37262ac030f11dad069a30339`
MD5	`dc7a1371a73255b346edc138c7492f23`
BLAKE2b-256	`69e1cd75ae37b85fd16fbf783a7eaa469102a844aff8c7f8bb874158c7d5ae38`

Provenance

The following attestation bundles were made for pysigma_backend_elasticsearch-2.0.3-py3-none-any.whl:

Publisher: release.yml on SigmaHQ/pySigma-backend-elasticsearch

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pysigma_backend_elasticsearch-2.0.3-py3-none-any.whl
- Subject digest: 131ae1532260c9604cd96616c96ad4b4c642e5e37262ac030f11dad069a30339
- Sigstore transparency entry: 1579758404
- Sigstore integration time: May 20, 2026
Source repository:
- Permalink: SigmaHQ/pySigma-backend-elasticsearch@c8729cc23d426499356ca56ea8139dc323111736
- Branch / Tag: refs/tags/v2.0.3
- Owner: https://github.com/SigmaHQ
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@c8729cc23d426499356ca56ea8139dc323111736
- Trigger Event: release
