sift

  ____ ___ _____ _____
 / ___|_ _|  ___|_   _|
 \___ \| || |_    | |
  ___) | ||  _|   | |
 |____/___|_|     |_|

AI-Powered Alert Triage Summarizer for SOC Teams

sift ingests raw security alerts, deduplicates and clusters related events, scores them by priority, and delivers a structured triage summary — with optional AI-generated analysis. Part of the barb → vex → sift SOC workflow trilogy.


Features

  • Ingest alerts from generic JSON, Splunk exports, or CSV
  • Deduplicate noisy alert streams before analysis
  • Extract IOCs (IPs, domains, hashes, URLs) from alert fields automatically
  • Cluster related alerts by IOC overlap, category + time window, or IP-pair correlation
  • Score clusters across five priority tiers: NOISE / LOW / MEDIUM / HIGH / CRITICAL
  • AI summarization via Anthropic Claude, OpenAI, Ollama (local), or template-based with no LLM required
  • Rich terminal output with priority-colored cluster table
  • Export to JSON, CSV, or STIX 2.1 for downstream tooling
  • Filter clusters using a boolean DSL (--filter 'priority >= HIGH AND ...')
  • Enrich IOCs via barb (phishing URL analysis) and vex (VirusTotal reputation) with --enrich
  • Cache triage results by input fingerprint with --cache (opt-in, 1h TTL)
  • Validate LLM output schema and detect prompt injection attacks
  • sift metrics <file> command for cluster and IOC distribution statistics
  • sift doctor diagnostics to verify configuration, LLM connectivity, and dependencies
  • PyPI version check on startup
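
To make the IOC extraction feature more concrete, here is a minimal sketch of regex-based extraction over alert fields. The patterns and the `extract_iocs` name are illustrative assumptions, not sift's actual rules; in particular, the domain pattern is deliberately loose and will also match things like file names.

```python
import re

# Illustrative patterns only (assumption): sift's real extraction rules
# are not documented in this README.
IOC_PATTERNS = {
    "url": re.compile(r"https?://[^\s\"'<>]+"),
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "domain": re.compile(
        r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b", re.IGNORECASE
    ),
}


def extract_iocs(alert: dict) -> dict[str, set[str]]:
    """Scan every string field of an alert and collect matches per IOC type."""
    found: dict[str, set[str]] = {kind: set() for kind in IOC_PATTERNS}
    for value in alert.values():
        if not isinstance(value, str):
            continue  # non-string fields (counts, booleans) are skipped
        for kind, pattern in IOC_PATTERNS.items():
            found[kind].update(pattern.findall(value))
    return found


sample = {
    "src_ip": "198.51.100.7",
    "url": "http://evil.example/login",
    "hash": "d41d8cd98f00b204e9800998ecf8427e",
    "count": 3,
}
iocs = extract_iocs(sample)
```

Because the URL field also contains a host name, the same alert contributes both a URL and a domain IOC in this sketch.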

Installation

pip install sift-triage

Optional extras:

# LLM summarization (Anthropic + OpenAI)
pip install "sift-triage[llm]"

# IOC enrichment via barb/vex
pip install "sift-triage[enrich]"

# Everything
pip install "sift-triage[llm,enrich]"

Kali Linux / Debian

# Recommended: use pipx for isolated CLI tool installation
sudo apt install pipx   # or: pip install pipx
pipx install sift-triage

# With LLM support
pipx install "sift-triage[llm]"

# With barb + vex enrichment
pipx install "sift-triage[enrich]"

Note: Python 3.11+ required. Kali Linux 2024+ includes Python 3.12 by default. On older systems: sudo apt install python3.12 python3.12-venv


Quick Start

Triage a JSON alert file:

sift triage alerts.json

Triage with AI summarization (Anthropic Claude):

sift triage alerts.json --summarize --provider anthropic

Pipe from Splunk or another tool:

cat splunk_export.json | sift triage -

Export triage report to JSON:

sift triage alerts.json -f json -o report.json

Export triage report as STIX 2.1 bundle:

sift triage alerts.json -f stix -o bundle.json

Filter to HIGH and CRITICAL clusters only:

sift triage alerts.json --filter 'priority >= HIGH'

Enable result caching (skip reprocessing on repeated runs):

sift triage alerts.json --cache

Show metrics for an alert file:

sift metrics alerts.json

Run diagnostics:

sift doctor

Enrich IOCs via barb (phishing URLs) + vex (VirusTotal):

sift triage alerts.json --enrich --summarize

Enrich only via barb (no VirusTotal API key needed):

sift triage alerts.json --enrich --enrich-mode barb

Configuration

sift stores settings in ~/.sift/config.yaml and credentials in ~/.sift/.env (mode 600). Both files are created automatically on first use.

Priority chain: CLI flags > SIFT_LLM_KEY env var > ~/.sift/.env > ~/.sift/config.yaml > defaults
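
The priority chain can be pictured as a layered lookup in which the first layer that defines a setting wins. The sketch below uses `collections.ChainMap` to show the idea; the `resolve_settings` function and the setting names (`provider`, `format`, `chunk_size`) are assumptions for illustration and may not match sift's internals.

```python
from collections import ChainMap


def resolve_settings(cli: dict, env: dict, dotenv: dict,
                     config: dict, defaults: dict) -> ChainMap:
    """Layer the sources so that earlier ones override later ones.

    Order follows the documented chain:
    CLI flags > environment > ~/.sift/.env > ~/.sift/config.yaml > defaults.
    """
    layers = [
        # Drop unset (None) values so they fall through to the next layer.
        {key: value for key, value in layer.items() if value is not None}
        for layer in (cli, env, dotenv, config, defaults)
    ]
    return ChainMap(*layers)


settings = resolve_settings(
    cli={"provider": None, "format": "json"},   # --provider not passed
    env={"provider": "openai"},
    dotenv={},
    config={"provider": "anthropic", "chunk_size": 100},
    defaults={"provider": "template", "format": "rich", "chunk_size": 0},
)
```

Here `provider` resolves from the environment layer, `format` from the CLI layer, and `chunk_size` from the config file layer.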

Show current config

sift config --show

Set LLM API key

The API key is stored in ~/.sift/.env and is never written to config.yaml.

sift config --api-key sk-ant-...          # Anthropic Claude
sift config --api-key sk-...              # OpenAI
sift config --unset-api-key               # Remove key

Alternatively, set the SIFT_LLM_KEY environment variable directly.

Set default provider and model

sift config --provider anthropic
sift config --provider openai --model gpt-4o
sift config --provider ollama --model llama3
sift config --provider template           # no LLM required (default)

Set output defaults

sift config --quiet                       # suppress banner by default
sift config --no-quiet                    # re-enable banner
sift config --default-format json         # default output format
sift config --default-format rich         # back to Rich table (default)

Set pipeline defaults

sift config --chunk-size 100             # process large batches in chunks of 100
sift config --chunk-size 0               # disable chunking (default)
sift config --cache                      # enable result caching by default
sift config --no-cache                   # disable caching (default)
sift config --enrich-consent             # pre-approve IOC enrichment (no prompt)
sift config --no-enrich-consent          # require prompt before enrichment (default)

Run sift config --help for the full option reference.


Workflow

sift is the third stage of a SOC analyst trilogy. Use barb to score and flag suspicious URLs in incoming data, pass flagged IOCs to vex for VirusTotal enrichment, then feed the enriched alert data into sift for cluster-level triage and summarization. Each tool is useful standalone; together they cover URL analysis → IOC reputation → alert prioritization in a single scriptable pipeline. The --enrich flag automates barb and vex calls directly from within sift triage.


Input Formats

| Format | Description | Notes |
| --- | --- | --- |
| Generic JSON | Array of alert objects or NDJSON | Any field schema; sift normalizes automatically |
| Splunk export | JSON export from Splunk Search | Handles results wrapper and Splunk field names |
| CSV | Comma-separated alert rows | First row treated as header; all fields extracted |

Pass - as the filename to read from stdin:

splunk-cli export | sift triage -
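
As a rough illustration of how the three JSON shapes above could be told apart, the sketch below accepts a JSON array, NDJSON, or an object wrapping a list. The `load_alerts` name and the `results` key for the Splunk wrapper are assumptions; sift's own normalizer may work differently.

```python
import json


def load_alerts(text: str) -> list[dict]:
    """Parse alert text that may be a JSON array, NDJSON, or a wrapped export."""
    text = text.strip()
    if not text:
        return []
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Not one JSON document: assume NDJSON, one object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if isinstance(data, list):
        return data
    # Assumption: a Splunk-style export wraps the rows under a "results" key.
    if isinstance(data, dict) and isinstance(data.get("results"), list):
        return data["results"]
    return [data]  # a single alert object
```

When reading from stdin, the same function could be called with `sys.stdin.read()`.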

LLM Providers

| Provider | Extra | Environment Variable | Notes |
| --- | --- | --- | --- |
| template | (none) | | Default; no LLM required |
| mock | (none) | | Deterministic mock output for testing and CI |
| anthropic | [llm] | ANTHROPIC_API_KEY | Claude via Anthropic API |
| openai | [llm] | OPENAI_API_KEY | GPT via OpenAI API |
| ollama | (none) | SIFT_OLLAMA_URL (optional) | Local inference; defaults to http://localhost:11434 |

Set the default provider in ~/.sift/config.yaml or via the SIFT_PROVIDER environment variable.


Enrichment (barb + vex)

The --enrich flag enriches extracted IOCs using the sister tools:

| Tool | PyPI | What it does | Required |
| --- | --- | --- | --- |
| barb | barb-phish | Heuristic phishing URL analysis | No (local) |
| vex | vex-ioc | VirusTotal IOC reputation lookup | API key via VT_API_KEY |

# Install enrichment extras
pip install "sift-triage[enrich]"

# Run with enrichment
sift triage alerts.json --enrich

# Barb only (no API key needed)
sift triage alerts.json --enrich --enrich-mode barb

# Skip consent prompt
sift triage alerts.json --enrich --yes

sift limits enrichment to 20 IOCs per run to avoid API rate limits.


Output Formats

| Flag | Output |
| --- | --- |
| rich (default) | Color-coded cluster table in the terminal |
| console | Plain-text output, safe for logging |
| json | Structured JSON with all cluster and IOC data |
| csv | Flat CSV suitable for SIEM import or spreadsheets |
| stix | STIX 2.1 bundle JSON for threat intelligence platforms |

Use -f / --format to select output format, and -o / --output to write to a file.


Advanced Usage

Alert Filtering

Use --filter to apply a boolean DSL to the cluster list after triage. Only matching clusters are included in the output.

# Only HIGH and CRITICAL clusters
sift triage alerts.json --filter 'priority >= HIGH'

# Malware or phishing clusters with more than 3 IOCs
sift triage alerts.json --filter 'category IN (malware, phishing) AND ioc_count > 3'

# Exclude low-signal categories
sift triage alerts.json --filter 'NOT category IN (false_positive)'

# Combine priority and alert count conditions
sift triage alerts.json --filter 'priority >= MEDIUM AND alert_count >= 5'

Supported fields: priority, category, ioc_count, alert_count. Supported operators: >=, <=, >, <, =, IN (...), NOT, AND, OR.
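
Ordered comparisons such as `priority >= HIGH` only make sense if the five tiers are ranked. The sketch below writes the Python equivalent of one filter expression by hand, assuming the tier order NOISE < LOW < MEDIUM < HIGH < CRITICAL and a cluster represented as a plain dict with the supported field names. It is not sift's parser, only a picture of what a filter evaluates to.

```python
# Tier order taken from the feature list; the numeric ranks are an assumption.
PRIORITY_RANK = {"NOISE": 0, "LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}


def matches(cluster: dict) -> bool:
    """Hand-written equivalent of:
    priority >= HIGH AND category IN (malware, phishing) AND ioc_count > 3
    """
    return (
        PRIORITY_RANK[cluster["priority"]] >= PRIORITY_RANK["HIGH"]
        and cluster["category"] in {"malware", "phishing"}
        and cluster["ioc_count"] > 3
    )


clusters = [
    {"priority": "CRITICAL", "category": "malware", "ioc_count": 7, "alert_count": 12},
    {"priority": "HIGH", "category": "phishing", "ioc_count": 2, "alert_count": 4},
    {"priority": "LOW", "category": "malware", "ioc_count": 9, "alert_count": 30},
]
kept = [c for c in clusters if matches(c)]
```

Only the first cluster passes: the second has too few IOCs and the third ranks below HIGH.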

Result Caching

Use --cache to cache triage results keyed by the SHA-256 fingerprint of the input. Repeated runs over the same input within the 1-hour TTL are served from the cache (stored in ~/.sift/cache/) instead of being reprocessed.

# First run: processes and caches the result
sift triage alerts.json --cache

# Subsequent runs with the same file: returns from cache
sift triage alerts.json --cache

# Combine with other flags; cache stores the full triage output
sift triage alerts.json --cache --summarize --provider anthropic

STIX 2.1 Export Pipeline

Export triage results as a STIX 2.1 threat intelligence bundle for ingestion into SIEM or TIP platforms.

# Export to STIX bundle file
sift triage alerts.json -f stix -o bundle.json

# Combined enrichment and STIX export
sift triage alerts.json --enrich -f stix -o enriched_bundle.json

# Pipe STIX output to another tool
sift triage alerts.json -f stix | jq '.objects | length'
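
For readers unfamiliar with the format, the sketch below builds a minimal STIX 2.1 bundle containing one indicator for an IPv4 address, using only the standard library. How sift maps clusters and IOCs onto STIX objects is not described here, so the function names and the choice of an `indicator` object are assumptions.

```python
import json
import uuid
from datetime import datetime, timezone


def stix_timestamp() -> str:
    """UTC timestamp with millisecond precision, e.g. 2024-01-01T00:00:00.000Z."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def ipv4_indicator(ip: str, name: str) -> dict:
    """Build a STIX 2.1 indicator whose pattern matches one IPv4 address."""
    now = stix_timestamp()
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "name": name,
        "pattern": f"[ipv4-addr:value = '{ip}']",
        "pattern_type": "stix",
        "valid_from": now,
    }


def make_bundle(objects: list[dict]) -> dict:
    """Wrap STIX objects in a bundle."""
    return {"type": "bundle", "id": f"bundle--{uuid.uuid4()}", "objects": objects}


bundle = make_bundle([ipv4_indicator("198.51.100.7", "Suspicious source IP")])
bundle_json = json.dumps(bundle, indent=2)
```

The `jq '.objects | length'` command shown above would report 1 for this bundle.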

Max Clusters

Limit the number of clusters returned by the pipeline with the max_clusters setting under clustering in ~/.sift/config.yaml. When the cluster count exceeds the limit, only the highest-priority clusters are retained. This is useful for large alert volumes where downstream tooling has per-report limits.

clustering:
  max_clusters: 50
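
The truncation rule amounts to "sort by priority, keep the top N". The sketch below shows one way to do that; the `cap_clusters` name, the dict shape, and treating a non-positive limit as "no limit" are assumptions rather than documented behavior.

```python
PRIORITY_RANK = {"NOISE": 0, "LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}


def cap_clusters(clusters: list[dict], max_clusters: int) -> list[dict]:
    """Keep at most max_clusters clusters, preferring higher priority tiers."""
    if max_clusters <= 0 or len(clusters) <= max_clusters:
        return list(clusters)  # under the limit: nothing to drop
    # sorted() is stable, so clusters in the same tier keep their input order.
    ranked = sorted(clusters, key=lambda c: PRIORITY_RANK[c["priority"]], reverse=True)
    return ranked[:max_clusters]


clusters = [
    {"id": 1, "priority": "LOW"},
    {"id": 2, "priority": "CRITICAL"},
    {"id": 3, "priority": "HIGH"},
    {"id": 4, "priority": "HIGH"},
]
top_two = cap_clusters(clusters, 2)
```

With a limit of 2, the CRITICAL cluster and the first HIGH cluster survive.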

Metrics

The sift metrics command runs the full normalization, dedup, and clustering pipeline over an alert file and displays summary statistics without generating a triage report.

sift metrics alerts.json

Output includes:

  • Total cluster count and alert count
  • Average cluster size
  • Top alert categories by frequency
  • IOC type distribution (IPs, domains, hashes, URLs)
  • AI summary success rate (if summaries were previously generated)

# Skip deduplication for raw counts
sift metrics alerts.json --no-dedup

# Use a custom config file
sift metrics alerts.json --config /path/to/config.yaml
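
The statistics listed above are simple aggregations over the cluster list. As a sketch, assuming each cluster is a dict with a `category`, a list of `alerts`, and a list of `iocs` that each carry a `type` (a hypothetical shape, not sift's data model), they could be computed like this:

```python
from collections import Counter


def cluster_metrics(clusters: list[dict], top_n: int = 5) -> dict:
    """Aggregate cluster, alert, category, and IOC-type statistics."""
    cluster_count = len(clusters)
    alert_count = sum(len(c["alerts"]) for c in clusters)

    # Weight each category by the number of alerts in its clusters.
    categories: Counter = Counter()
    for c in clusters:
        categories[c["category"]] += len(c["alerts"])

    ioc_types = Counter(ioc["type"] for c in clusters for ioc in c["iocs"])
    return {
        "cluster_count": cluster_count,
        "alert_count": alert_count,
        "avg_cluster_size": alert_count / cluster_count if cluster_count else 0.0,
        "top_categories": categories.most_common(top_n),
        "ioc_types": dict(ioc_types),
    }


metrics = cluster_metrics([
    {"category": "malware", "alerts": [1, 2, 3],
     "iocs": [{"type": "ipv4"}, {"type": "sha256"}]},
    {"category": "phishing", "alerts": [4],
     "iocs": [{"type": "url"}, {"type": "ipv4"}]},
    {"category": "malware", "alerts": [5, 6], "iocs": []},
])
```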

Validation and Security

sift validates all LLM outputs against a strict JSON schema. The --validate-only flag parses and validates the input, then exits without rendering output:

# Validate parsed structure without rendering output
sift triage alerts.json --validate-only

A built-in prompt injection detector scans LLM inputs for five pattern categories: instruction overrides, output manipulation, JSON escapes, encoded payloads, and shell injection. Suspicious content is flagged and summarization falls back to the template provider automatically.
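
A pattern-based detector of this kind can be approximated with one regular expression per category. The five category names below come from the description above; the patterns themselves and the function names are invented for illustration and are much cruder than a production detector would be (the encoded-payload pattern, for example, would also flag long hex hashes).

```python
import re

# Hypothetical patterns, one per documented category.
INJECTION_PATTERNS = {
    "instruction_override": re.compile(
        r"ignore (?:all |any )?(?:previous|prior|above) instructions", re.IGNORECASE
    ),
    "output_manipulation": re.compile(
        r"(?:respond|reply|answer) (?:only )?with", re.IGNORECASE
    ),
    "json_escape": re.compile(r"[\"']\s*\}\s*,?\s*\{"),
    "encoded_payload": re.compile(r"\b[A-Za-z0-9+/]{40,}={0,2}"),
    "shell_injection": re.compile(
        r"(?:;|\|\||&&|\|)\s*(?:rm|curl|wget|bash|sh)\b|\$\("
    ),
}


def detect_injection(text: str) -> list[str]:
    """Return the names of all categories whose pattern appears in the text."""
    return [name for name, pattern in INJECTION_PATTERNS.items() if pattern.search(text)]


def choose_provider(requested: str, alert_text: str) -> str:
    """Fall back to the template provider when the input looks suspicious."""
    return "template" if detect_injection(alert_text) else requested
```

For example, `choose_provider("anthropic", "user agent: x; rm -rf /")` returns `"template"` in this sketch, while a plain failed-login message leaves the requested provider unchanged.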


Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Triage complete: no HIGH or CRITICAL clusters found |
| 1 | Triage complete: one or more HIGH or CRITICAL clusters found |
| 2 | Error: invalid input, configuration failure, or LLM error |

Exit code 1 is designed for use in CI pipelines and automated response playbooks.
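
In an automated playbook, the exit code can be mapped to an action without parsing the report. The sketch below wraps `sift triage` with `subprocess.run`; the action labels and function names are hypothetical, and `run_triage` assumes `sift` is installed and on `PATH`.

```python
import subprocess

# Documented exit codes mapped to hypothetical playbook actions.
EXIT_ACTIONS = {
    0: "no-action",   # no HIGH or CRITICAL clusters
    1: "escalate",    # at least one HIGH or CRITICAL cluster
    2: "error",       # invalid input, configuration failure, or LLM error
}


def interpret(returncode: int) -> str:
    """Translate a sift exit code into a playbook action."""
    return EXIT_ACTIONS.get(returncode, "unknown")


def run_triage(path: str) -> tuple[str, str]:
    """Run sift on a file and return (action, JSON report text)."""
    proc = subprocess.run(
        ["sift", "triage", path, "-f", "json"],
        capture_output=True,
        text=True,
    )
    return interpret(proc.returncode), proc.stdout
```

Note that exit code 1 signals a finding rather than a failure, so CI jobs that treat any non-zero status as an error may need to distinguish it from code 2.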


Configuration

sift config --show    # display current configuration
sift doctor           # verify config, LLM connectivity, and dependencies

Configuration is resolved in priority order: CLI flags > environment variables > ~/.sift/.env > ~/.sift/config.yaml > defaults.


Part of the SOC Trilogy

| Tool | Role | PyPI |
| --- | --- | --- |
| barb | Heuristic phishing URL analyzer | barb-phish |
| vex | VirusTotal IOC enrichment | vex-ioc |
| sift | Alert triage summarizer | sift-triage |

License

MIT — see LICENSE for details.

Author: Christian Huhn
