AI-native code security scanner with cascade analysis and Firecracker-microVM DAST runtime validation

These details have not been verified by PyPI

Project description

Argus Scanner

We don't flag what we can't exploit.

Argus is an AI-native code security scanner combining a cost-tiered LLM harness (Gemini Flash-Lite triage → Sonnet 4.6 → Opus 4.6 escalation) with runtime DAST detonation in a Firecracker microVM and sandbox-verified remediation. Whether the bug is in code your team wrote (SQL injection, auth bypass, deserialization, command injection, crypto misuse) or in code your stack quietly pulled in (a malicious package, a poisoned CLAUDE.md, a backdoored setup.py, a tampered ML checkpoint loader about to run on someone's machine) — Argus detonates it in the sandbox, captures the exploit firing, generates a patch, replays the same exploit against the patched source, and ships the result as a CI gate.

It targets the gap between "this looks suspicious" (pattern-matching SAST) and "this actually exploits something" (manual reverse engineering).

One scanner. Two threat models. Zero false-positive triage.

Open source. BYOK. Apache 2.0.

You pay your providers directly — Anthropic + Google for the cascade, Fly.io for the optional DAST sandbox. Argus collects nothing.

Quick Start

Get from install to first scan in under 60 seconds:

pip install argus-ai-scanner
export ANTHROPIC_API_KEY="your-anthropic-key"
export GEMINI_API_KEY="your-gemini-key"

# Single file
argus scan path/to/suspicious.py

# Whole repo (current directory)
argus scan-repo .

# CI mode — only files changed vs main, SARIF for GitHub Code Scanning
argus scan-repo . --diff origin/main --output sarif --output-file findings.sarif

# Pre-install supply-chain gate — scan a PyPI package + its dep closure
# BEFORE pip installs anything. Blocks day-zero malware at the ingestion boundary.
argus install requests
argus install -r requirements.txt --dry-run        # CI gate without installing
argus install litellm --strict-coverage             # extra-paranoid mode

Without DAST configured, the CLI degrades gracefully to cascade-only verdicts. DAST mode (Firecracker sandbox) requires a Fly.io account; see docs/dast-setup.md.
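The SARIF file written by the CI-mode command above can be summarized before it is uploaded. The sketch below assumes only the standard SARIF v2.1.0 layout (`runs[].results[].level`); which rules and levels Argus actually emits is not documented here, and `summarize_sarif` is a hypothetical helper, not part of the Argus CLI.

```python
from collections import Counter


def summarize_sarif(sarif: dict) -> Counter:
    """Count results per SARIF level across all runs in a parsed SARIF log."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # A result without an explicit level is counted as "warning" here,
            # which matches the usual SARIF default.
            counts[result.get("level", "warning")] += 1
    return counts
```

Load `findings.sarif` with `json.load` and pass the resulting dict to `summarize_sarif` to get a per-level count for a CI log line.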

Benchmark Performance

Adversarial regression suite, labeled by a 4-LLM consensus oracle. Methodology, sample size, and per-file breakdown: bench_results/v1_1_launch/launch_report.md.

                       Verdict-exact (higher = better)
Argus (cascade + DAST) ██████████████████░░  91.3%
Gemini 3.1 Pro         █████████████████░░░  82.6%
Grok 4.3               █████████████████░░░  82.6%
Opus 4.6               ████████████████░░░░  78.3%
GPT 5.4                ███████████████░░░░░  73.9%

How Argus works (the three pillars)

Argus has three pillars. The capability matrix below shows exactly what each pillar does for each file type.

Pillar 1 — Cascade harness (static + AI analysis)

Every recognized file flows through a cost-tiered model cascade. Deterministic preprocessing first (free, no models): SHA-256, multi-stage deobfuscation (base64 / hex / eval-chain), dependency graphing, attack-vector flagging, AI-file-pattern detection. Files with no outbound intent get dropped before a single token is spent.
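To make the preprocessing step concrete, here is a minimal sketch of content hashing plus multi-stage decoding. It is not Argus's implementation: the function names, the stage limit, and the regular expressions are assumptions chosen for illustration, and eval-chain unwrapping is left out.

```python
import base64
import hashlib
import re
from typing import Optional, Tuple

# Hypothetical limit and patterns, for illustration only.
MAX_STAGES = 5
HEX_RE = re.compile(r"^(?:[0-9a-fA-F]{2})+$")
B64_RE = re.compile(r"^[A-Za-z0-9+/]+={0,2}$")


def sha256_hex(data: bytes) -> str:
    """Content hash used to identify a file."""
    return hashlib.sha256(data).hexdigest()


def decode_once(text: str) -> Optional[str]:
    """Peel one hex or base64 layer; return None if neither applies."""
    candidate = text.strip()
    if len(candidate) < 8:
        return None
    try:
        # Hex is checked first because hex digits are also valid base64.
        if HEX_RE.match(candidate):
            return bytes.fromhex(candidate).decode("utf-8")
        if B64_RE.match(candidate) and len(candidate) % 4 == 0:
            return base64.b64decode(candidate, validate=True).decode("utf-8")
    except ValueError:
        # Covers binascii.Error and UnicodeDecodeError: not a text payload.
        return None
    return None


def peel_layers(text: str) -> Tuple[str, int]:
    """Decode repeatedly until nothing more decodes or the limit is hit."""
    stages = 0
    while stages < MAX_STAGES:
        decoded = decode_once(text)
        if decoded is None:
            break
        text, stages = decoded, stages + 1
    return text, stages
```

A payload wrapped as hex over base64 comes back as the inner source with a stage count of 2, which a later step could treat as an obfuscation signal.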

Survivors route through a model cascade:

Cascade stage	Model	Cost / file	Decides
Triage	Gemini Flash-Lite	~$0.001	`CLEAN` / `LOW` / `HIGH` routing
Cheap analysis (LOW tier)	Gemini Flash	~$0.02	findings on low-priority files
Default deep analysis (HIGH tier)	Anthropic Claude Sonnet 4.6	~$0.07	findings on high-priority files
High-stakes / borderline escalation	Anthropic Claude Opus 4.6	~$0.15	~20% of HIGH files

The harness emits structured findings: CWE, line, severity, code, explanation, suggested fix, proof-of-concept, behavioral profile, attack chains, composite risk score. Aggregate cost is ~$4.65 per 100-file scan on a realistic workload mix; hard per-file + per-scan cost caps abort runs that exceed your declared budget.
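The per-stage prices in the table can be combined into a rough scan estimate. This is a sketch only: the workload mix below is hypothetical (the README does not publish the mix behind its ~$4.65 figure), and it assumes an escalated file pays for both the Sonnet pass and the Opus pass.

```python
from dataclasses import dataclass

# Approximate per-file costs taken from the cascade table above (USD).
COST = {"triage": 0.001, "flash": 0.02, "sonnet": 0.07, "opus": 0.15}


@dataclass
class Mix:
    """Hypothetical workload mix; the field names are illustrative."""
    triaged: int                    # files that survive preprocessing
    low: int                        # files routed to the LOW tier
    high: int                       # files routed to the HIGH tier
    escalation_rate: float = 0.20   # share of HIGH files escalated to Opus


def estimate_cost(mix: Mix) -> float:
    """Sum stage costs; escalated files are assumed to pay Sonnet and Opus."""
    escalated = mix.high * mix.escalation_rate
    return (
        mix.triaged * COST["triage"]
        + mix.low * COST["flash"]
        + mix.high * COST["sonnet"]
        + escalated * COST["opus"]
    )


def exceeds_budget(mix: Mix, max_cost: float) -> bool:
    """True when the estimate is above a declared per-scan budget."""
    return estimate_cost(mix) > max_cost
```

For example, `estimate_cost(Mix(triaged=100, low=30, high=40))` evaluates to roughly 4.70 under these assumptions.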

Pillar 2 — DAST runtime detonation

When the harness flags suspicion at sufficient verdict tier, the file moves to a Firecracker microVM (minimal-v2, networked-v2, or ml_tools-v2 image profile) for two phases:

Phase A — exploit testing. Plan an exploit per harness finding, run it in the sandbox, capture syscalls / egress / filesystem writes, classify each finding as CONFIRMED / BLOCKED / UNREACHED / NOT_TESTED based on what actually happened.
Phase B — exploit discovery. Given accumulated evidence, propose NEW hypotheses the harness missed. A deterministic validator gates the proposals; survivors carry forward into the next iteration's Phase A. Up to 3 iterations or until convergence.

This is the layer that kills false positives: a "looks like SQL injection" pattern that the file's own escaping defends against gets BLOCKED, not flagged. It also surfaces what static analysis missed: Phase B has turned up findings the harness did not catch.
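The Phase A / Phase B iteration described above can be sketched as a bounded loop. The callables `test_exploit`, `propose`, and `validate` are hypothetical stand-ins for the sandbox run, the hypothesis generator, and the deterministic validator; only the three-iteration limit and the stop-on-convergence rule come from the description.

```python
from typing import Callable, Dict, Iterable, List

MAX_ITERATIONS = 3  # "Up to 3 iterations or until convergence"


def run_dast_loop(
    initial: Iterable[str],
    test_exploit: Callable[[str], str],
    propose: Callable[[Dict[str, str]], List[str]],
    validate: Callable[[str], bool],
) -> Dict[str, str]:
    """Alternate exploit testing (A) and discovery (B) until nothing new survives."""
    statuses: Dict[str, str] = {}
    pending = list(dict.fromkeys(initial))  # de-duplicate, keep order
    for _ in range(MAX_ITERATIONS):
        if not pending:
            break  # converged: no hypotheses left to test
        # Phase A: test every pending hypothesis and record its status.
        for hypothesis in pending:
            statuses[hypothesis] = test_exploit(hypothesis)
        # Phase B: propose new hypotheses from the evidence gathered so far;
        # only unseen proposals that pass the validator carry forward.
        proposals = dict.fromkeys(propose(statuses))
        pending = [h for h in proposals if h not in statuses and validate(h)]
    return statuses
```

Keeping the validator outside the proposal step means a rejected hypothesis never consumes a sandbox run.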

Pillar 3 — Remediation (fix-and-verify)

When Phase A confirms an exploit on text source (Python, JS / TS, shell), Argus generates a patched version, replays the same exploit attempts against the patched code in the same sandbox, and emits per-finding NEUTRALIZED / STILL_EXPLOITABLE / UNVERIFIABLE with sandbox-grounded evidence. You don't get a remediation suggestion; you get a remediation that's been tested.

Binary artifact policy. For ML artifacts (.pkl / .pt / .bin / .safetensors / .h5 / .onnx), Argus does NOT auto-patch the binary — the model can't emit valid bytecode-level patches and a corrupt patched pickle would mislead the replay. Instead, the remediation pillar emits structured guidance: regenerate the model from a clean training pipeline and serialize using safetensors (which is structurally incapable of carrying executable __reduce__ payloads). Status is UNVERIFIABLE with the guidance in fix_summary.
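The per-finding remediation status can be sketched as a small decision function. The three status names and the binary-artifact rule come from the text above; the function name, its parameters, and the rule that a replay which could not run yields UNVERIFIABLE are assumptions.

```python
# Suffixes listed in the binary artifact policy above.
BINARY_SUFFIXES = (".pkl", ".pt", ".bin", ".safetensors", ".h5", ".onnx")


def remediation_status(path: str, replay_ran: bool, exploit_fired_after_patch: bool) -> str:
    """Map a replay outcome to NEUTRALIZED / STILL_EXPLOITABLE / UNVERIFIABLE."""
    if path.lower().endswith(BINARY_SUFFIXES):
        # Binary artifacts are never auto-patched, so there is nothing to replay.
        return "UNVERIFIABLE"
    if not replay_ran:
        # Assumption: a replay that could not execute proves nothing either way.
        return "UNVERIFIABLE"
    return "STILL_EXPLOITABLE" if exploit_fired_after_patch else "NEUTRALIZED"
```

A patch only earns NEUTRALIZED when the replay actually ran and the original exploit no longer fired.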

Opt-out: pass --no-remediation to skip this pillar entirely while keeping the harness + DAST active. Use for compliance scans, CI gates that don't allow source-modification suggestions, read-only audits, or to save ~$0.05/file in patch-generation tokens. The result still includes a structured phase_c block with skipped_reason: "phase_c_disabled_by_config" so downstream consumers can distinguish "remediation off" from "ran and found nothing to fix."

Coverage matrix

What each pillar does, per file type. ✅ = supported, ⚠️ = supported with policy nuance, ⏳ = roadmap, ❌ N/A = not applicable to this format.

File type	Harness analysis	DAST exploit testing	DAST exploit discovery	Remediation
Python (`.py`, `.pyw`, `.pyi`, `.pth`)	✅	✅	✅	✅ patch + replay
JavaScript / TypeScript (`.js`, `.mjs`, `.cjs`, `.jsx`, `.ts`, `.tsx`)	✅	✅	✅	✅ patch + replay
Shell (`.sh`, `.bash`, `.zsh`)	✅	✅	✅	✅ patch + replay
Jupyter notebooks (`.ipynb`)	✅ cell-by-cell decomposition	⏳ roadmap	⏳ roadmap	⏳ roadmap
ML model artifacts (`.pkl`, `.pickle`, `.pt`, `.bin`, `.safetensors`, `.h5`, `.hdf5`, `.keras`, `.onnx`)	✅ pickletools disassembly	✅ load-detonation in sandbox	❌	⚠️ guidance only (no auto-patch — see binary policy)
GitHub Actions workflows (`.github/workflows/*.yml`)	✅ deterministic CI-pattern sweep	⏳ roadmap	⏳ roadmap	⏳ roadmap
Supply-chain manifests (`package.json`, `requirements.txt`, `Cargo.lock`, `go.mod`, `Gemfile`, `Pipfile`, `setup.py`, `pyproject.toml`, `pom.xml`, `build.gradle`, `*.csproj`, etc.)	✅ parsed for deps + lifecycle hooks	❌ N/A (no runtime to detonate)	❌ N/A	❌ N/A
AI-agent config sentinels (`CLAUDE.md`, `AGENTS.md`, `SKILL.md`, `.cursorrules`, `.clinerules`, `mcp.json`, `plugin.json`, `openapi.{yaml,json}`, `agent-config.{yaml,json,toml}`, etc.)	✅ prompt-injection surface	❌ N/A	❌ N/A	❌ N/A
Other languages tagged for harness (Java, Kotlin, Scala, Go, Rust, Ruby, PHP, C#, C/C++, PowerShell, Lua, Perl, R, Swift, Terraform, HCL)	✅ generic harness analysis	⏳ roadmap	⏳ roadmap	⏳ roadmap

Per-finding verdicts (where the FP kill happens)

Every finding ships with one of these statuses:

Status	Meaning
`CONFIRMED`	Sandbox observed the exploit firing. PoC + event trace surfaced with the finding.
`BLOCKED`	Attack was tested; the file's own code defended against it (sanitization, escaping, allowlist).
`UNREACHED`	Attack was tested; the code path is genuinely unreachable.
`NOT_TESTED`	Sandbox couldn't execute the test. Sub-reason: `infra_stub` / `inconclusive` / `not_planned`.

A CONFIRMED finding looks like this:

{
  "cwe": "CWE-200",
  "type": "data_exfiltration",
  "severity": "critical",
  "status": "CONFIRMED",
  "confidence": 1.0,
  "runtime_evidence": "Mock HTTP server at 127.0.0.1:8000 captured POST body containing 'FAKE_PRIVATE_KEY_CONTENT' and 'ssh-rsa AAAAFAKEKEY user@host'. The malware decoded its base64 payload and POSTed the contents of ~/.ssh/ to the rewritten C2 endpoint.",
  "proof_of_concept": "On any Unix host with SSH keys present, execution sends the full contents of ~/.ssh/ to the remote C2 server over HTTPS."
}

DAST cuts three ways: it confirms exploits with sandbox-captured evidence, refutes false positives with proof of non-exploitability, and verifies remediations by replaying the same exploits against the patched source.
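A downstream CI step might gate on these statuses. The sketch below assumes the JSON report can be reduced to a list of finding objects shaped like the example above; the severity ordering is also an assumption, since only `critical` appears in this document.

```python
from typing import Dict, List

# Hypothetical ordering; only "critical" is shown in the example finding.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def blocking_findings(findings: List[Dict], min_severity: str = "high") -> List[Dict]:
    """Keep CONFIRMED findings at or above the severity floor."""
    floor = SEVERITY_RANK[min_severity]
    top = max(SEVERITY_RANK.values())
    return [
        f for f in findings
        if f.get("status") == "CONFIRMED"
        # Unknown severities are ranked highest so the gate fails closed.
        and SEVERITY_RANK.get(f.get("severity"), top) >= floor
    ]


def gate_exit_code(findings: List[Dict], min_severity: str = "high") -> int:
    """Return 1 (fail the build) if any finding blocks, else 0."""
    return 1 if blocking_findings(findings, min_severity) else 0
```

BLOCKED and UNREACHED findings never fail the build here, which mirrors the idea that only sandbox-confirmed exploits should block a merge.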

Enterprise Invariants

Anthropic's Claude Security and OpenAI's Codex Security are enterprise-tier and vendor-cloud-only. Argus is the open alternative.

BYOK. You control LLM access; bills go to your API meter, not ours.
Zero telemetry. Argus itself collects nothing. In cascade-only mode, file content goes only to the LLM providers you configure with your own keys. In DAST mode, file content is additionally sent to a Fly.io app you own and control, never to Argus-operated infrastructure.
Local execution. The pipeline runs on your machine; there is no Argus-hosted SaaS dependency.

CLI Reference

`argus scan <file>` — single-file scan

Flag	Purpose
`--output {json,markdown}`	Output format (default: `json`)
`--no-dast`	Skip DAST verification (cascade-only)
`--no-remediation`	Skip Phase C (fix-and-verify). Phase A + B still run; no patch is generated. Compliance / CI-gate / read-only-audit use cases. Saves ~$0.05/file.
`--max-cost USD`	Abort this file's scan if per-file API spend exceeds USD (default: $1.00; pass `0` to disable)
`--enable-discovery`	Proactive payload sweep: runs a library of attack payloads against the file in the sandbox and surfaces runtime-confirmed CWEs as new findings (+~$0.25/file)
`--enable-runtime-probe`	Phase B+ runtime exploit probing (v1.5). Sonnet generates concrete attack inputs targeting probe-attractive functions; sandbox executes each; deterministic rules confirm exploits via runtime evidence (return value, side-effect canaries). Python only in v1.5; opt-in (~$0.20–0.50/file).
`--dast-trigger-verdicts LIST`	Comma-separated L1 verdicts that trigger DAST. Default: `malicious,critical_malicious`. Allowed: `clean,suspicious,malicious,critical_malicious`

`argus scan-repo <path>` — directory tree scan

Flag	Purpose
`--diff REF`	Only scan files differing vs git ref (e.g., `--diff origin/main` for PR/CI)
`--output {markdown,json,sarif}`	Output format (default: `markdown`); `sarif` is SARIF v2.1.0 for GitHub Code Scanning
`--output-file PATH`	Write to file instead of stdout
`--max-cost USD`	Abort the run when cumulative API spend across all files exceeds USD; remaining files are marked `cost_cap_reached`. Pass `0` or omit to disable
`--exclude GLOB`	Additional gitignore-style exclude pattern (repeatable)
`--no-gitignore`	Ignore `.gitignore` during walk (default: respected)
`--max-file-bytes BYTES`	Skip files larger than BYTES (default: 1 MiB)
`--no-dast`	Skip DAST verification on every file
`--no-remediation`	Skip Phase C on every file. Phase A + B still run; no patches generated.
`--enable-discovery`	Proactive payload sweep on every DAST-eligible file
`--enable-runtime-probe`	Phase B+ runtime exploit probing (v1.5) on every DAST-eligible Python file. See `argus scan` for description.
`--dast-trigger-verdicts LIST`	Same as `scan`
`--continue-on-error` / `--no-continue-on-error`	On per-file exception, record and continue (default) or abort run
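The cumulative `--max-cost` behavior for `scan-repo` can be sketched as follows. `scan_one` is a hypothetical callable returning a verdict and its API cost; only the `cost_cap_reached` marker and the "0 or omitted disables the cap" rule come from the table above.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def scan_with_cap(
    files: Iterable[str],
    scan_one: Callable[[str], Tuple[str, float]],
    max_cost: Optional[float] = None,
) -> Tuple[Dict[str, str], float]:
    """Scan files in order; once spend exceeds the cap, mark the rest."""
    spent = 0.0
    results: Dict[str, str] = {}
    for path in files:
        # A cap of 0 or None disables the check entirely.
        if max_cost and spent > max_cost:
            results[path] = "cost_cap_reached"
            continue
        verdict, cost = scan_one(path)
        results[path] = verdict
        spent += cost
    return results, spent
```

Because the check runs before each file, the file that pushes spend over the cap still completes; only the files after it are skipped.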

`argus install <pkg>` — pre-install supply-chain gate

Stages the package via pip download (no setup.py execution), runs the full Argus pipeline on every wheel/sdist in the dependency closure, then either calls real pip install or blocks with the analysis printed. Catches day-zero supply-chain malware at the ingestion boundary — exactly the class advisory-based scanners (pip-audit, safety) miss.

Flag	Purpose
`<pkg>`	Package spec (e.g. `'requests'`, `'litellm==1.50.0'`, `'fastapi[all]'`). Mutually exclusive with `-r`.
`-r PATH` / `--requirement PATH`	Install from a requirements.txt; Argus scans every wheel in the resolved closure.
`--block-on LIST`	Comma-separated verdict tiers that block install. Default: `malicious,critical_malicious`. Use `suspicious,malicious,critical_malicious` for stricter gating.
`--no-dast`	Cascade-only — skip DAST runtime detonation even if Fly is configured. Faster + cheaper, but leaves runtime-only exploits (load-time RCE in pickles, etc.) un-validated.
`--no-cache`	Ignore the wheel-hash verdict cache. Re-scans every artifact from scratch.
`--cache-dir PATH`	Override cache directory (default: `~/.cache/argus/install`).
`--dry-run`	Run the scan + report verdict; do NOT call `pip install`. For CI gating without side effects.
`--strict-coverage`	Escalate verdict to `suspicious` when Argus could only statically analyze <70% of files in a wheel (rest are typically native binaries: `.so`, `.pyd`, `.dll`, `.dylib`, `.exe`). For security-paranoid users / strict CI gates.
`--max-cost USD`	Per-file cost cap (default: $1.00).
`--max-total-cost USD`	Aggregate cost cap across the whole dependency-closure scan (default: $10). When tripped, remaining wheels are flagged `suspicious / unscanned-due-to-cost-cap` and the install fails closed. Pass `0` to disable.
`--deep`	Full-fidelity scan — `thinking_budget=24000` on every Sonnet/Opus call, sequential per-file scan, 4 wheels concurrent. ~5–10× more expensive but catches subtle multi-step exploits the default mode might miss.
`--no-thinking`	Explicit way to set `thinking_budget=0`. Already the install default; flag exists for script readability. Mutually exclusive with `--deep`.
`--parallel N`	Max number of artifacts scanned concurrently (default: 8). Pass lower if you hit API rate limits.
`--enable-runtime-probe`	Phase B+ runtime exploit probing (v1.5) on every DAST-eligible Python file in the dependency closure. Adds ~$0.20–0.50/file. See `argus scan`.
`--pip EXEC`	Pip executable. Default: `pip`. Pass `'uv pip'` for uv-managed envs.
`--output {text,json}`	Output format. Default: text. JSON for CI consumption.

Phase C is always disabled on the install path. Remediation for a not-yet-installed package is "don't install", not "patch + replay." If the cascade flags a malicious verdict, the install is blocked; the user sees the analysis (CWE, runtime evidence, exfil destination) and decides.

Wheel-hash caching. Verdicts are cached at ~/.cache/argus/install/<sha256>.json. Wheel bytes are immutable on PyPI (re-uploads of the same name+version are rejected), so a verdict is permanently valid for that exact artifact. First-run cost is real; subsequent installs of the same wheel are free.
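A content-addressed cache of this kind takes only a few lines. The `<sha256>.json` file layout is from the paragraph above; the function names and the contents of the cached JSON are assumptions, not the real cache schema.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def wheel_sha256(wheel: Path) -> str:
    """Hash the artifact in 1 MiB chunks so large wheels are not read at once."""
    digest = hashlib.sha256()
    with wheel.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_verdict(cache_dir: Path, wheel: Path) -> Optional[dict]:
    """Return the stored verdict for this exact artifact, or None on a miss."""
    entry = cache_dir / f"{wheel_sha256(wheel)}.json"
    if not entry.is_file():
        return None
    return json.loads(entry.read_text(encoding="utf-8"))


def store_verdict(cache_dir: Path, wheel: Path, verdict: dict) -> Path:
    """Persist a verdict keyed by the artifact's SHA-256."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / f"{wheel_sha256(wheel)}.json"
    entry.write_text(json.dumps(verdict), encoding="utf-8")
    return entry
```

Keying on the content hash rather than the package name and version means a changed artifact can never reuse an old verdict.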

Coverage transparency. A "clean" verdict on a wheel that's 50% native binaries (.so, .pyd) is honestly weaker evidence than a clean verdict on a wheel that's 100% Python — the report says so. Every artifact verdict reports n_files_unscanned + extension histogram. Native binaries are not silently scrubbed from the verdict — coverage warnings surface. --strict-coverage opt-in escalates the verdict on low-coverage artifacts.
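The coverage rule behind `--strict-coverage` can be sketched like this. The 70% threshold, the native suffix list, and the verdict tiers come from this README; treating every non-native file as statically analyzable, and the `extension_histogram` and `coverage` key names, are simplifying assumptions.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Dict, Iterable

NATIVE_SUFFIXES = {".so", ".pyd", ".dll", ".dylib", ".exe"}
COVERAGE_FLOOR = 0.70
VERDICT_ORDER = ["clean", "suspicious", "malicious", "critical_malicious"]


def coverage_report(file_names: Iterable[str]) -> Dict:
    """Count files that cannot be statically analyzed and histogram their suffixes."""
    names = list(file_names)
    unscanned = [n for n in names if PurePosixPath(n).suffix.lower() in NATIVE_SUFFIXES]
    histogram = Counter(PurePosixPath(n).suffix.lower() for n in unscanned)
    coverage = 1.0 if not names else (len(names) - len(unscanned)) / len(names)
    return {
        "n_files_unscanned": len(unscanned),
        "extension_histogram": dict(histogram),
        "coverage": coverage,
    }


def apply_strict_coverage(verdict: str, coverage: float) -> str:
    """Escalate to "suspicious" on low coverage; never downgrade a verdict."""
    low_coverage = coverage < COVERAGE_FLOOR
    below_suspicious = VERDICT_ORDER.index(verdict) < VERDICT_ORDER.index("suspicious")
    return "suspicious" if low_coverage and below_suspicious else verdict
```

A wheel that is half native binaries reports a coverage of 0.5, so a `clean` verdict becomes `suspicious` while a `malicious` verdict is left unchanged.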

Security & Isolation

Argus deliberately detonates potentially malicious code. Host protection is non-negotiable.

Hardware-level isolation. Execution happens inside Firecracker microVMs using KVM hardware virtualization.
Ephemeral state. Every detonation runs in a pristine microVM that is destroyed after execution. Zero persistence.
Strict egress control. Network profiles enforced at the hypervisor level prevent lateral movement during DAST verification.

Documentation

Topic	Page
Install guide	docs/install.md
API key sourcing	docs/api-keys.md
Architecture deep dive	docs/architecture.md
DAST sandbox setup	docs/dast-setup.md
Cost guide	docs/cost-guide.md
Roadmap	ROADMAP.md
Contributing	CONTRIBUTING.md
Security disclosures	SECURITY.md

License

Apache License 2.0.

Project details

Release history

Version	Released
1.5.0 (this version)	May 10, 2026
1.3.1	May 10, 2026
1.3.0	May 10, 2026
1.2.1	May 10, 2026
1.2.0	May 8, 2026
1.1.1	May 7, 2026
1.1.0	May 7, 2026

Download files

Source Distribution

argus_ai_scanner-1.5.0.tar.gz (594.3 kB)

Uploaded May 10, 2026 Source

Built Distribution

argus_ai_scanner-1.5.0-py3-none-any.whl (424.5 kB)

Uploaded May 10, 2026 Python 3

File details

Details for the file argus_ai_scanner-1.5.0.tar.gz.

File metadata

Download URL: argus_ai_scanner-1.5.0.tar.gz
Upload date: May 10, 2026
Size: 594.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for argus_ai_scanner-1.5.0.tar.gz
Algorithm	Hash digest
SHA256	`9aca1b6e2d6cf98306b03c34371e1464e10b0bb9ecb1195f665dfdd962d63334`
MD5	`9528dfcb4c5b08ab86f555eefdffa6a1`
BLAKE2b-256	`9aa4cc2d7dc0a8b619d62c3567e12f0bf280d734f575595fec01d9951c1533fc`

Provenance

The following attestation bundles were made for argus_ai_scanner-1.5.0.tar.gz:

Publisher: release.yml on dshochat/Argus_Scanner

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: argus_ai_scanner-1.5.0.tar.gz
- Subject digest: 9aca1b6e2d6cf98306b03c34371e1464e10b0bb9ecb1195f665dfdd962d63334
- Sigstore transparency entry: 1500038230
- Sigstore integration time: May 10, 2026
Source repository:
- Permalink: dshochat/Argus_Scanner@16d7eb2e4266d630147c282eb4a8e46c3ab53694
- Branch / Tag: refs/tags/v1.5.0
- Owner: https://github.com/dshochat
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@16d7eb2e4266d630147c282eb4a8e46c3ab53694
- Trigger Event: push

File details

Details for the file argus_ai_scanner-1.5.0-py3-none-any.whl.

File metadata

Download URL: argus_ai_scanner-1.5.0-py3-none-any.whl
Upload date: May 10, 2026
Size: 424.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for argus_ai_scanner-1.5.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5ec21c69b3a192cb53a6532fd23045d49e605a20ab6b838d0f2aef8d6a39d1b7`
MD5	`584dc00f64650336ca1eefc85472f399`
BLAKE2b-256	`8ce4cd0023dd233d0c4063380a88fdfb2385a9d051b25775096b609b9172915f`

Provenance

The following attestation bundles were made for argus_ai_scanner-1.5.0-py3-none-any.whl:

Publisher: release.yml on dshochat/Argus_Scanner

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: argus_ai_scanner-1.5.0-py3-none-any.whl
- Subject digest: 5ec21c69b3a192cb53a6532fd23045d49e605a20ab6b838d0f2aef8d6a39d1b7
- Sigstore transparency entry: 1500038849
- Sigstore integration time: May 10, 2026
Source repository:
- Permalink: dshochat/Argus_Scanner@16d7eb2e4266d630147c282eb4a8e46c3ab53694
- Branch / Tag: refs/tags/v1.5.0
- Owner: https://github.com/dshochat
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@16d7eb2e4266d630147c282eb4a8e46c3ab53694
- Trigger Event: push
