Streaming log intelligence agent — detects operational failures and security threats with online ML
Seerflow
A streaming, entity-centric log intelligence agent that detects operational failures and security threats across log sources. It combines traditional ML (fast, cheap) for bulk detection with Sigma rules for known threat patterns: 63 rules are bundled, and custom rule directories can add more from the 3,000+ community detections.
Status
Alpha — Full ingestion + detection + Sigma rules pipeline operational.
Quick Start
# Install from source
git clone https://github.com/seerflow/seerflow.git
cd seerflow
uv sync
# Copy and edit the example config
cp seerflow.example.yaml seerflow.yaml
# Start the pipeline (also serves the React dashboard)
uv run python -m seerflow start
# → React dashboard: http://127.0.0.1:8080/
# → REST API: http://127.0.0.1:8080/api/v1/
# → WebSocket stream: ws://127.0.0.1:8080/api/v1/ws
A single seerflow start boots the receivers, detection engines, and the
FastAPI dashboard on dashboard_port (default 8080). No second uvicorn
process is required — the wheel ships the built React assets and the CLI
mounts them via the same FastAPI app that exposes /api/v1/*.
Command Line
# Start with default config (seerflow.yaml in current directory)
uv run python -m seerflow start
# Start with a specific config file
uv run python -m seerflow --config /path/to/seerflow.yaml start
# Show version
uv run python -m seerflow --version
Inspect loaded detection rules
# List everything
uv run python -m seerflow rules list
# Only rules tagged with a MITRE technique (prefix match includes sub-techniques)
uv run python -m seerflow rules list --technique T1053
# Filter by tactic (name or ATT&CK ID)
uv run python -m seerflow rules list --tactic persistence
uv run python -m seerflow rules list --tactic TA0003
# JSON for scripting
uv run python -m seerflow rules list --format json
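The technique filter above can be pictured with a small sketch. It assumes rules carry Sigma-style tags such as `attack.t1053.003` (the SigmaHQ tagging convention) and reads "prefix match" as "the technique itself or any of its sub-techniques". The function name, rule titles, and rule dictionaries are hypothetical illustrations, not Seerflow's internals or its JSON output schema.

```python
def matches_technique(tags: list[str], technique: str) -> bool:
    """Return True if a tag names the technique or one of its sub-techniques."""
    wanted = technique.lower()
    for tag in tags:
        tag = tag.lower()
        if not tag.startswith("attack."):
            continue  # not an ATT&CK tag
        name = tag[len("attack."):]
        # Exact match (t1053) or sub-technique match (t1053.003).
        if name == wanted or name.startswith(wanted + "."):
            return True
    return False


# Hypothetical rule records, for illustration only.
rules = [
    {"title": "Cron job modification", "tags": ["attack.persistence", "attack.t1053.003"]},
    {"title": "Scheduled task", "tags": ["attack.t1053"]},
    {"title": "Web shell", "tags": ["attack.t1505.003"]},
]
selected = [r["title"] for r in rules if matches_technique(r["tags"], "T1053")]
print(selected)
```

Matching on the `.` boundary rather than on a raw string prefix keeps an unrelated ID that merely shares leading digits from being selected.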
Docker
# Build and run with SQLite defaults (zero config)
docker compose up -d
# Run with PostgreSQL (set password first)
export POSTGRES_PASSWORD=your-secure-password
docker compose --profile postgres up -d
# Or run standalone from a registry image
docker run -p 8080:8080 -p 4317:4317 -p 514:514/udp seerflow/seerflow
# Mount a custom config
docker run -v "$(pwd)/seerflow.yaml:/app/seerflow.yaml:ro" seerflow/seerflow
What It Does
- Ingests logs from multiple sources simultaneously (syslog, OTLP gRPC/HTTP, file tailing, webhooks)
- Parses each log line with Drain3 (template extraction) and regex entity extraction (IPs, users, hosts, files, domains, processes)
- Resolves entities to deterministic UUID5 IDs for cross-source correlation
- Scores events with an ML ensemble: Half-Space Trees (content), Holt-Winters (volume), CUSUM (change), Markov chains (sequence) -- blended with z-normalization
- Thresholds scores with biDSPOT (EVT-based auto-threshold -- no manual tuning)
- Evaluates 63 bundled Sigma rules (Linux, web, DNS, process, network) with MITRE ATT&CK tagging
- Graphs entity relationships with igraph -- PageRank, Louvain, fan-out, betweenness centrality
- Accumulates per-entity risk with exponential decay -- catches slow-burn multi-step attacks
- Alerts on anomalies, Sigma matches, and risk threshold exceedances
- Persists all events, alerts, graph edges, and ML model state to SQLite (default) or PostgreSQL
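Deterministic entity IDs are what make cross-source correlation possible: the same IP seen in syslog and in OTLP hashes to the same UUID. A minimal sketch using the standard library's `uuid.uuid5` follows. The namespace, the `type:value` key format, and the normalization step are assumptions made for illustration; the README does not specify how Seerflow derives its keys.

```python
import uuid

# Hypothetical namespace; Seerflow's real namespace UUID is not given here.
ENTITY_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "entities.example.invalid")


def entity_id(entity_type: str, value: str) -> uuid.UUID:
    """Derive a stable ID from the entity type and a normalized value."""
    normalized = value.strip().lower()
    # UUID5 is a SHA-1 hash of namespace + name, so equal inputs give equal IDs.
    return uuid.uuid5(ENTITY_NAMESPACE, f"{entity_type}:{normalized}")


from_syslog = entity_id("ip", "203.0.113.1")
from_otlp = entity_id("ip", " 203.0.113.1 ")  # same entity, different source formatting
print(from_syslog == from_otlp)
```

Including the entity type in the hashed name keeps a host called `web` and a user called `web` from colliding.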
Example: Detect Anomalies in Syslog
# seerflow.yaml
receivers:
  syslog_enabled: true
  syslog_udp_port: 5514  # use high port to avoid root
  otlp_grpc_enabled: false
  otlp_http_enabled: false
  webhook_enabled: false
detection:
  hst_window_size: 100  # lower for faster calibration
  dspot:
    calibration_window: 200
    risk_level: 0.01  # more sensitive for testing
# Terminal 1: Start Seerflow
uv run python -m seerflow start
# Terminal 2: Send normal traffic
for i in $(seq 1 300); do
  echo "<134>1 2026-03-24T19:00:00Z web nginx $i - - GET /api/v$((i%5)) 200 ${i}ms" \
    | nc -u -w1 127.0.0.1 5514
done
# Terminal 2: Send an anomalous event
echo '<11>1 2026-03-24T19:01:00Z db postgres 999 - - FATAL connection limit exceeded 847/100' \
| nc -u -w1 127.0.0.1 5514
Output:
INFO Seerflow 0.3.0 starting
INFO Receivers: syslog
INFO Pipeline running — Ctrl+C to stop
WARNING ANOMALY [syslog] score=0.952 threshold=0.009 dir=upper
WARNING template: [7] <*> <*> postgres <*> - - FATAL connection limit exceeded <*>
WARNING message: <11>1 2026-03-24T19:01:00Z db postgres 999 - - FATAL connection limit exceeded 847/100
WARNING entities: 203.0.113.1
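Where `nc` is unavailable, the same test traffic can be produced from Python. The sketch below builds lines in the shape used by the shell examples (RFC 5424 header with `-` for the message ID and structured data) and sends them over UDP. The helper names are hypothetical, and the default port assumes the `syslog_udp_port: 5514` setting from the example config.

```python
import socket


def rfc5424_line(facility: int, severity: int, timestamp: str, host: str,
                 app: str, procid: str, text: str) -> str:
    """Format one syslog line; PRI is facility * 8 + severity per RFC 5424."""
    pri = facility * 8 + severity
    return f"<{pri}>1 {timestamp} {host} {app} {procid} - - {text}"


def send_udp(line: str, host: str = "127.0.0.1", port: int = 5514) -> None:
    """Send a single datagram; UDP gives no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


# facility 1 (user) + severity 3 (error) -> <11>, matching the anomaly above
anomaly = rfc5424_line(1, 3, "2026-03-24T19:01:00Z", "db", "postgres", "999",
                       "FATAL connection limit exceeded 847/100")
print(anomaly)
```

Calling `send_udp(anomaly)` while the pipeline is running should have the same effect as the `echo ... | nc` command.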
Shutdown Summary
Press Ctrl+C to see session stats:
INFO --- Session Summary ---
INFO Events processed: 312
INFO Anomalies detected: 10
INFO Unique templates: 7
INFO Duration: 45.3s
INFO Throughput: 7 events/sec
INFO Seerflow stopped
Configuration
See SETTINGS.md for the complete configuration reference.
All settings are optional -- Seerflow runs with sensible defaults (zero-config).
Key config sections:
- receivers -- syslog, OTLP gRPC/HTTP, file tailing, webhooks (enable/disable + ports)
- detection -- HST window size, DSPOT calibration, scoring weights, custom Sigma rule directories
- storage -- SQLite (default) or PostgreSQL
- alerting -- dedup window, webhook/PagerDuty targets
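The project structure notes that `config.py` supports `${ENV_VAR}` interpolation, which keeps secrets such as database passwords out of the YAML file. The sketch below shows one way such interpolation can work on an already-parsed config. It is not Seerflow's loader: the function name, the `storage.url` key, and the choice to raise on a missing variable are all assumptions.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(value):
    """Recursively replace ${NAME} in strings with os.environ[NAME].

    A missing variable raises KeyError, so misconfiguration fails loudly.
    """
    if isinstance(value, str):
        return _ENV_REF.sub(lambda m: os.environ[m.group(1)], value)
    if isinstance(value, dict):
        return {key: interpolate(item) for key, item in value.items()}
    if isinstance(value, list):
        return [interpolate(item) for item in value]
    return value  # numbers, booleans, None pass through unchanged


os.environ["POSTGRES_PASSWORD"] = "example-only"
# Hypothetical config shape, standing in for the parsed YAML.
config = {"storage": {"url": "postgresql://seerflow:${POSTGRES_PASSWORD}@db/seerflow"}}
resolved = interpolate(config)
print(resolved["storage"]["url"])
```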
Receivers
| Receiver | Port | Protocol | Status |
|---|---|---|---|
| Syslog UDP/TCP | 514 (5514 unprivileged) | RFC 5424/3164 | Done |
| OTLP gRPC | 4317 | Protobuf | Done |
| OTLP HTTP | 4318 | Protobuf + JSON | Done |
| File tailing | -- | Glob + watchfiles | Done |
| Webhooks | 8081 | JSON/form + auth | Done |
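To exercise the OTLP HTTP receiver without an OpenTelemetry SDK, a log record can be posted as OTLP/JSON to the `/v1/logs` path on port 4318. The sketch below builds a minimal payload following the OTLP JSON encoding (lowerCamelCase field names, 64-bit integers as strings). The helper names and the `demo` service name are hypothetical, and the receiver must be enabled (`otlp_http_enabled: true`) for the POST to succeed.

```python
import json
import time
import urllib.request


def otlp_logs_payload(body: str, severity_text: str = "ERROR",
                      service: str = "demo") -> dict:
    """Build a one-record OTLP/JSON logs export request."""
    return {
        "resourceLogs": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": service}},
            ]},
            "scopeLogs": [{
                "logRecords": [{
                    "timeUnixNano": str(time.time_ns()),  # int64 is a string in JSON
                    "severityText": severity_text,
                    "body": {"stringValue": body},
                }],
            }],
        }],
    }


def post_logs(payload: dict, url: str = "http://127.0.0.1:4318/v1/logs") -> int:
    """POST the payload and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


payload = otlp_logs_payload("FATAL connection limit exceeded 847/100")
print(json.dumps(payload, indent=2))
```

Calling `post_logs(payload)` against a running instance sends the record; the payload builder is kept separate so it can be inspected or tested offline.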
Detection Pipeline
Log Sources → Receivers → Drain3 → UUID5 Entities → ML Ensemble → Sigma Rules
                                          ↓              ↓             ↓
                                    Entity Graph   blended score  ATT&CK tags
                                    Window Buffer   [0.0 - 1.0]   tactic/technique
                                    Risk Register        ↓             ↓
                                          ↓      Risk Accumulation → Alert
                                  PageRank, Louvain
                                 Fan-out, Betweenness
- Drain3: Streaming log template extraction (120K msgs/sec)
- UUID5 Entity Resolution: Deterministic cross-source entity IDs (same entity = same UUID)
- Half-Space Trees: Content anomaly detection via River (constant time/memory)
- Holt-Winters: Volume anomaly detection (trend + seasonal decomposition)
- CUSUM: Change-point detection (bidirectional cumulative sum)
- Markov Chains: Sequence anomaly detection (per-entity transition matrices)
- biDSPOT: Bidirectional EVT auto-threshold (upper spikes + lower drops)
- DetectionEnsemble: Orchestrates all detectors + blended scoring per source
- Sigma Engine: 63 bundled SigmaHQ rules with logsource-indexed dispatch
- Entity Graph: igraph-backed relationship graph with typed edges + 6 algorithms
- Risk Accumulation: Per-entity risk register with exponential decay + configurable threshold
- Sliding Window: Per-entity event buffer with watermark-based late arrival tolerance
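Risk accumulation with exponential decay is what lets several individually unremarkable events add up to an alert. The sketch below shows the idea under stated assumptions: a half-life parameterization, additive scores, and a fixed threshold. The class names, parameter names, and default values are hypothetical and are not taken from Seerflow's `risk.py`.

```python
import math
from dataclasses import dataclass


@dataclass
class RiskEntry:
    score: float = 0.0
    updated_at: float = 0.0  # event time in seconds


class RiskRegister:
    """Per-entity risk that decays exponentially between events."""

    def __init__(self, half_life_s: float = 3600.0, threshold: float = 1.5) -> None:
        self.decay_rate = math.log(2) / half_life_s
        self.threshold = threshold
        self.entries: dict[str, RiskEntry] = {}

    def current(self, entity_id: str, now: float) -> float:
        """Return the decayed risk for an entity at time `now`."""
        entry = self.entries.get(entity_id)
        if entry is None:
            return 0.0
        elapsed = max(0.0, now - entry.updated_at)  # late events never inflate risk
        return entry.score * math.exp(-self.decay_rate * elapsed)

    def add(self, entity_id: str, anomaly_score: float, now: float) -> bool:
        """Decay the stored risk, add the new score, report a threshold breach."""
        score = self.current(entity_id, now) + anomaly_score
        self.entries[entity_id] = RiskEntry(score, now)
        return score >= self.threshold


register = RiskRegister(half_life_s=3600.0, threshold=1.5)
# Three events of 0.8 each, 30 minutes apart: none breaches 1.5 on its own.
breaches = [register.add("host-a", 0.8, now=t) for t in (0.0, 1800.0, 3600.0)]
print(breaches)
```

With a one-hour half-life the third event pushes the accumulated risk past the threshold, while the same three events spread over days would decay away before adding up.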
Development
Requires Python 3.11+ and uv.
# Install dependencies
uv sync
# Run tests
uv run pytest
# Run quality gates
uv run ruff check . && uv run ruff format --check . && uv run mypy src/ && uv run bandit -r src/ -c pyproject.toml && uv run pytest --cov=src/seerflow --cov-fail-under=95
Project Structure
src/seerflow/
__main__.py # CLI entry point (config → pipeline → detection → storage)
cli.py # argparse (--config, --version)
config.py # YAML config loader with ${ENV_VAR} interpolation
models/ # SeerflowEvent, Alert, entity structs (msgspec)
storage/
protocols.py # Protocol interfaces (LogStore, AlertStore, ModelStore, EntityStore)
sqlite.py # SQLite backend (WAL, FTS5, WriteBuffer)
migrations.py # Schema versioning + forward-only migration runner
receivers/
base.py # RawEvent dataclass, Receiver protocol
manager.py # ReceiverManager (bounded queue, backpressure, shutdown)
syslog.py # UDP/TCP syslog (RFC 5424/3164)
otlp_grpc.py # OTLP gRPC receiver (protobuf LogRecord)
otlp_http.py # OTLP HTTP receiver (/v1/logs, protobuf + JSON)
file_tail.py # File tailing (glob, rotation, checkpoint)
webhook.py # Webhooks (JSON/form, field mapping, auth)
parsing/
drain.py # Drain3 wrapper for template extraction
entities.py # Regex entity extraction (6 types, params-aware tagging)
normalizer.py # EventNormalizer: RawEvent → SeerflowEvent
detection/
protocols.py # Detector Protocol (score, learn, serialize, deserialize)
hst.py # Half-Space Trees detector (River)
threshold.py # biDSPOT auto-threshold (scipy GPD)
ensemble.py # DetectionEnsemble orchestrator (4 detectors + blended scoring)
sigma/
engine.py # SigmaEngine: rule loading, logsource dispatch, evaluation
matcher.py # Custom detection matcher (condition tree walker, regex cache)
pipeline.py # pySigma processing pipeline (22 field mappings)
attack.py # MITRE ATT&CK tactic/technique extraction
bundled.py # Bundled rule path discovery (importlib.resources)
loader.py # Custom rule directory discovery + validation
rules/ # 63 curated SigmaHQ YAML rules (linux, web, dns, process, network)
graph/
entity_graph.py # igraph wrapper: vertices, edges, queries, algorithms
edges.py # Typed edge inference from entity pairs
algorithms.py # PageRank, Louvain, fan-out, fan-in, betweenness, ego-graph
correlation/
window.py # Per-entity sliding window buffer (deque, LRU eviction)
watermark.py # Watermark-based late arrival tolerance
risk.py # Risk accumulation with exponential decay
pipeline/
handler.py # Event handler: parse → detect → graph → correlate → store
run.py # Pipeline runner (config → receivers → handler → storage)
tests/
unit/ # 1200+ unit tests
integration/ # Integration tests (pipeline, graph, correlation, real SQLite)
benchmarks/ # Throughput benchmarks (pytest-benchmark, CI history tracking)
Benchmarks
uv run pytest tests/benchmarks/ --benchmark-autosave
uv run pytest tests/benchmarks/ --benchmark-compare
| Component | Throughput |
|---|---|
| Syslog parse | ~561K msgs/sec |
| Drain3 templates | ~120K msgs/sec |
| Entity extraction | ~41K msgs/sec |
| Full normalizer | ~39.5K msgs/sec |
| Full pipeline (parse + ML + Sigma + storage) | ~1,800 events/sec |
License