pySigma backend for Apache Spark/Databricks

Status: experimental, work in progress:
- Although cidrmatch is generated, you still need to provide the corresponding function as a UDF (I'll add an example later)
- Requires more testing
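Until an official example is available, the following is a minimal sketch of what such a UDF could look like. It is not the author's implementation. It assumes the generated SQL calls cidrmatch(column, 'cidr') with the IP address first and the CIDR range second; check the queries the backend actually produces before relying on this argument order. The Spark registration is shown only in comments so that the function itself can be tried without a cluster.

```python
import ipaddress


def cidrmatch(ip, cidr):
    """Return True if `ip` falls inside the network `cidr`.

    Assumption: argument order (ip, cidr) mirrors the generated SQL.
    NULL or malformed values yield False instead of raising an error.
    """
    if ip is None or cidr is None:
        return False
    try:
        # strict=False accepts ranges with host bits set, e.g. 10.1.2.3/8
        return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr, strict=False)
    except ValueError:
        return False


# Registering the function so Spark SQL can call it (requires an active
# SparkSession named `spark`, as in a Databricks notebook):
#
#   from pyspark.sql.types import BooleanType
#   spark.udf.register("cidrmatch", cidrmatch, BooleanType())
```

A plain Python UDF like this is evaluated row by row, so it may be slow on large tables; it is meant as a starting point for testing generated queries.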
pySigma Databricks Backend
This is the Databricks backend for pySigma. It provides the package sigma.backends.databricks with the DatabricksBackend class.
Further, it contains the following processing pipelines in sigma.pipelines.databricks:
- snake_case: convert column names into snake case format
It supports the following output formats:
- default: plain Databricks/Apache Spark SQL queries
- dbsql: Databricks SQL queries with rules metadata (title, status) embedded as comment
- detection_yaml: YAML markup for my own detection framework
Unbound Keyword Search
The backend supports Sigma rules with unbound keywords (values without field names). These keywords search the raw log line.
Configuration
By default, the backend looks for keywords in a field named raw. You can customize this:
Command Line:
sigma convert -t databricks -O raw_log_field=message rule.yml
Programmatic:
from sigma.backends.databricks import DatabricksBackend
backend = DatabricksBackend(raw_log_field="event_data")
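To show the programmatic route end to end, here is a hedged sketch that converts a small rule held in a string. It assumes pySigma and this backend are installed. The rule text, the RULE_YAML constant and the convert_rule helper are illustrative names, not part of the package. SigmaCollection.from_yaml and the backend's convert method come from pySigma itself.

```python
# Illustrative rule; the title and logsource values are made up for this sketch.
RULE_YAML = """
title: Suspicious service keyword
status: test
logsource:
    product: windows
    service: system
detection:
    keywords:
        - 'EVILSERVICE'
    condition: keywords
"""


def convert_rule(rule_yaml, raw_log_field="raw"):
    """Convert a Sigma rule (YAML text) into a list of Databricks SQL queries.

    Imports are done lazily so this module can be loaded even when
    pySigma is not installed.
    """
    from sigma.collection import SigmaCollection
    from sigma.backends.databricks import DatabricksBackend

    backend = DatabricksBackend(raw_log_field=raw_log_field)
    return backend.convert(SigmaCollection.from_yaml(rule_yaml))


# Example call (needs pySigma and the Databricks backend installed):
#   for query in convert_rule(RULE_YAML, raw_log_field="event_data"):
#       print(query)
```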
Examples
Simple Keywords (OR logic):
detection:
    keywords:
        - 'EVILSERVICE'
        - 'svchost.exe -n evil'
    condition: keywords
Generates: contains(lower(raw), lower('EVILSERVICE')) OR contains(lower(raw), lower('svchost.exe -n evil'))
Keywords with |all (AND logic):
detection:
    keywords:
        '|all':
            - 'Remove-MailboxExportRequest'
            - ' -Identity '
    condition: keywords
Generates: contains(lower(raw), lower('Remove-MailboxExportRequest')) AND contains(lower(raw), lower(' -Identity '))
Mixed with Field Conditions:
detection:
    selection:
        EventID: 4688
    keywords:
        - 'mimikatz'
    condition: selection and keywords
Generates: EventID = 4688 AND contains(lower(raw), lower('mimikatz'))
Wildcards in Keywords:
detection:
    keywords:
        - '*malware*'   # uses contains()
        - 'cmd.exe*'    # uses startswith()
        - '*.dll'       # uses endswith()
    condition: keywords
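The wildcard rules above can be pictured with a small stand-alone sketch. This is not the backend's code: keyword_to_sql and keywords_to_sql are hypothetical helpers, and the exact SQL shape for startswith() and endswith() is assumed by analogy with the contains() output shown in the earlier examples. Quote escaping is deliberately left out.

```python
def keyword_to_sql(keyword, field="raw"):
    """Map one keyword to a case-insensitive Spark SQL predicate.

    Leading '*' only   -> endswith()
    Trailing '*' only  -> startswith()
    Both or neither    -> contains()
    """
    leading = keyword.startswith("*")
    trailing = keyword.endswith("*")
    text = keyword.strip("*")  # drop the wildcard markers themselves
    if leading and not trailing:
        func = "endswith"
    elif trailing and not leading:
        func = "startswith"
    else:
        func = "contains"
    return f"{func}(lower({field}), lower('{text}'))"


def keywords_to_sql(keywords, field="raw", all_of=False):
    """Join keyword predicates with OR (default) or AND (for '|all')."""
    joiner = " AND " if all_of else " OR "
    return joiner.join(keyword_to_sql(k, field) for k in keywords)


print(keywords_to_sql(["*malware*", "cmd.exe*", "*.dll"]))
# prints: contains(lower(raw), lower('malware')) OR startswith(lower(raw), lower('cmd.exe')) OR endswith(lower(raw), lower('.dll'))
```

Compare the printed string with the query the backend actually emits for the same rule; any difference means the assumptions in this sketch do not hold for your version.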
Regex Patterns:
detection:
    keywords:
        - '|re': '.*evil(cmd|powershell).*'
    condition: keywords
Generates: raw rlike '.*evil(cmd|powershell).*'
Maintainer
This backend is currently maintained by: