
pySigma backend for Apache Spark/Databricks


Status: experimental, work in progress:

  • The backend generates calls to cidrmatch, but you still need to provide the corresponding function yourself as a UDF (I'll add an example later)
  • More testing is required
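
Until an official example is available, here is a minimal sketch of what such a UDF could look like. The argument order (network first, address second) and the choice to treat malformed input as "no match" are assumptions, not something this README specifies, so check them against the SQL your rules actually generate.

```python
import ipaddress


def cidrmatch(cidr, ip):
    """Return True when `ip` falls inside the `cidr` network.

    Assumption: the network is the first argument and the address the second.
    Swap them if the generated SQL calls the function the other way round.
    """
    if cidr is None or ip is None:
        return False
    try:
        network = ipaddress.ip_network(cidr, strict=False)
        address = ipaddress.ip_address(ip)
    except ValueError:
        # Malformed input counts as "no match" instead of failing the query.
        return False
    # An IPv4 address never belongs to an IPv6 network, and vice versa.
    return address.version == network.version and address in network


# Registration in a Spark session (not executed here, requires pyspark):
#   from pyspark.sql.types import BooleanType
#   spark.udf.register("cidrmatch", cidrmatch, BooleanType())

print(cidrmatch("10.0.0.0/8", "10.1.2.3"))
print(cidrmatch("10.0.0.0/8", "192.168.1.1"))
```

A plain Python UDF is the simplest option; for large tables a vectorized (pandas) UDF or a native SQL expression would likely perform better.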

pySigma Databricks Backend

This is the Databricks backend for pySigma. It provides the package sigma.backends.databricks with the DatabricksBackend class. Further, it contains the following processing pipelines in sigma.pipelines.databricks:

  • snake_case: convert column names into snake case format
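
To illustrate the idea behind the snake_case pipeline, the sketch below converts typical Sigma field names into snake case. The function to_snake_case is hypothetical and is not the pipeline's implementation; in particular, how the real pipeline treats acronyms such as ID or IP is an assumption here.

```python
import re


def to_snake_case(name: str) -> str:
    # Insert "_" between a lowercase letter or digit and a following capital.
    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    # Split an acronym from a following capitalized word: "IPAddress" -> "IP_Address".
    s = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", "_", s)
    return s.lower()


for field in ("EventID", "CommandLine", "IPAddress"):
    print(field, "->", to_snake_case(field))
```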

It supports the following output formats:

  • default: plain Databricks/Apache Spark SQL queries
  • dbsql: Databricks SQL queries with rule metadata (title, status) embedded as a comment
  • detection_yaml: YAML markup for my own detection framework

Unbound Keyword Search

The backend supports Sigma rules with unbound keywords (values without field names). These keywords search the raw log line.

Configuration

By default, the backend looks for keywords in a field named raw. You can customize this:

Command Line:

sigma convert -t databricks -O raw_log_field=message rule.yml

Programmatic:

from sigma.backends.databricks import DatabricksBackend

backend = DatabricksBackend(raw_log_field="event_data")

Examples

Simple Keywords (OR logic):

detection:
    keywords:
        - 'EVILSERVICE'
        - 'svchost.exe -n evil'
    condition: keywords

Generates: contains(lower(raw), lower('EVILSERVICE')) OR contains(lower(raw), lower('svchost.exe -n evil'))

Keywords with |all (AND logic):

detection:
    keywords:
        '|all':
            - 'Remove-MailboxExportRequest'
            - ' -Identity '
    condition: keywords

Generates: contains(lower(raw), lower('Remove-MailboxExportRequest')) AND contains(lower(raw), lower(' -Identity '))
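
The two keyword examples above differ only in the operator that joins the per-keyword checks. The sketch below emulates that expression shape in plain Python so you can preview what a keyword list should turn into. The helper keyword_expr is hypothetical, is not part of the backend, and does no escaping of quotes inside keywords.

```python
def keyword_expr(keywords, field="raw", match_all=False):
    """Build a case-insensitive contains() check per keyword and join them.

    match_all=False mimics a plain keyword list (OR);
    match_all=True mimics the '|all' modifier (AND).
    """
    parts = [f"contains(lower({field}), lower('{kw}'))" for kw in keywords]
    joiner = " AND " if match_all else " OR "
    return joiner.join(parts)


print(keyword_expr(["EVILSERVICE", "svchost.exe -n evil"]))
print(keyword_expr(["Remove-MailboxExportRequest", " -Identity "], match_all=True))
# A different raw log field, as set through raw_log_field:
print(keyword_expr(["mimikatz"], field="message"))
```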

Mixed with Field Conditions:

detection:
    selection:
        EventID: 4688
    keywords:
        - 'mimikatz'
    condition: selection and keywords

Generates: EventID = 4688 AND contains(lower(raw), lower('mimikatz'))

Wildcards in Keywords:

detection:
    keywords:
        - '*malware*'      # uses contains()
        - 'cmd.exe*'       # uses startswith()
        - '*.dll'          # uses endswith()
    condition: keywords
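
The mapping in the comments above depends only on where the wildcard sits. A rough Python sketch of that decision follows. The helper wildcard_expr is hypothetical, and the lower() wrapping in its output is an assumption carried over from the earlier keyword examples; this README does not show the SQL generated for wildcards. Wildcards in the middle of a keyword are not handled.

```python
def wildcard_expr(pattern: str, field: str = "raw") -> str:
    leading = pattern.startswith("*")
    trailing = pattern.endswith("*")
    core = pattern.strip("*")  # the literal text without the outer wildcards
    if leading and not trailing:
        func = "endswith"      # '*.dll'
    elif trailing and not leading:
        func = "startswith"    # 'cmd.exe*'
    else:
        func = "contains"      # '*malware*' or a keyword without wildcards
    return f"{func}(lower({field}), lower('{core}'))"


for kw in ("*malware*", "cmd.exe*", "*.dll"):
    print(kw, "->", wildcard_expr(kw))
```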

Regex Patterns:

detection:
    keywords:
        - '|re': '.*evil(cmd|powershell).*'
    condition: keywords

Generates: raw rlike '.*evil(cmd|powershell).*'
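
Before deploying a regex keyword, it can help to try the pattern against a few sample log lines locally. The sketch below uses Python's re module as a stand-in, assuming that rlike succeeds when the pattern is found anywhere in the value. Spark uses Java regular expressions, so more exotic syntax may behave differently than it does here. The sample lines are made up for illustration.

```python
import re

# Pattern from the rule above; Python's `re` stands in for Spark's regex engine.
PATTERN = re.compile(r".*evil(cmd|powershell).*")


def matches(raw: str) -> bool:
    # re.search looks for the pattern anywhere in the string.
    return PATTERN.search(raw) is not None


samples = [
    "C:\\tools\\evilcmd.exe /c whoami",
    "launching evilpowershell -enc AAAA",
    "cmd.exe /c dir",
]
for line in samples:
    print(matches(line), line)
```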

Maintainer

This backend is currently maintained by:
