PicoSentry: deterministic supply-chain scanner for npm/pnpm, safe for ML pipelines
PicoSentry 🦞
Deterministic, offline supply-chain scanner for npm/pnpm, safe for ML pipelines.
Same inputs + same corpus version = same findings and scan fingerprint. Every time.
No HTTP at scan time. No probabilistic heuristics. No narrative in findings.
Note on determinism: Default JSON output includes audit timestamps and timing data (useful for forensics). For byte-identical output across runs, use --deterministic-output or --verify-determinism, which omit timestamps and timing for reproducible CI artifacts.
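To illustrate why timestamps and timing data prevent byte-identical output, here is a minimal Python sketch. It is not PicoSentry's implementation: the field names scanned_at and duration_ms and the helpers strip_volatile and fingerprint are hypothetical. The only idea shown is that volatile fields are dropped and keys are sorted before hashing.

```python
import hashlib
import json

# Hypothetical names for volatile fields; PicoSentry's real schema may differ.
VOLATILE_KEYS = frozenset({"scanned_at", "duration_ms"})


def strip_volatile(value):
    """Recursively drop volatile keys from nested dicts and lists."""
    if isinstance(value, dict):
        return {k: strip_volatile(v) for k, v in value.items() if k not in VOLATILE_KEYS}
    if isinstance(value, list):
        return [strip_volatile(v) for v in value]
    return value


def fingerprint(scan: dict) -> str:
    """SHA-256 over a canonical (sorted-key, compact) JSON encoding."""
    canonical = json.dumps(strip_volatile(scan), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two runs with identical findings but different timestamps and key order.
run_a = {
    "findings": [{"rule": "L2-POST-001", "severity": "HIGH"}],
    "scanned_at": "2024-01-01T00:00:00Z",
    "duration_ms": 12,
}
run_b = {
    "duration_ms": 57,
    "scanned_at": "2024-01-02T09:30:00Z",
    "findings": [{"severity": "HIGH", "rule": "L2-POST-001"}],
}

assert fingerprint(run_a) == fingerprint(run_b)
```

Hashing the raw dictionaries instead would give different digests for the two runs, which is the case the default JSON output is in.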
Quick Start
# Install
pip install picosentry
# Or from source
pip install -e .
# Or use Docker
docker build -t picosentry .
docker run --rm -v $(pwd):/scan picosentry scan /scan
# Scan a project
picosentry scan ./my-project
# CI-friendly health check (exit 1 on HIGH+CRITICAL only)
picosentry check ./my-project --fail-on high
# JSON output (deterministic, sorted keys)
picosentry scan ./my-project --format json
# Monorepo workspace scan
picosentry workspace . --format table
# CycloneDX SBOM
picosentry scan ./my-project --format cyclonedx
# Verify determinism: run twice, assert SHA-256 match (exit 0=match, 4=violation)
picosentry scan ./my-project --verify-determinism
# Manage custom IoC corpus packs
picosentry corpus export ./my-iocs.json
picosentry corpus import ./community-pack.json
picosentry corpus list
# Manage custom IoCs
picosentry ioc register ./suspicious-pkg.json
picosentry ioc list
CLI Reference
picosentry scan <target> [OPTIONS]      Scan a project directory
picosentry check <target> [OPTIONS]     CI-optimized health check (exit-code only)
picosentry workspace <root> [OPTIONS]   Scan entire monorepo (discovers all npm/pnpm projects)
picosentry corpus export <output>       Export custom IoCs as a shareable pack
picosentry corpus import <path>         Import a corpus pack into your IoC registry
picosentry corpus validate <path>       Validate a corpus pack without importing
picosentry corpus list                  List available corpus packs
picosentry ioc register <path>          Register a custom IoC indicator
picosentry ioc list                     List user-registered custom IoCs
picosentry ioc remove <id>              Remove a custom IoC by ID
picosentry rules [--json]               List available detector rules
picosentry version                      Show version, corpus version, rule count
picosentry diff <a.json> <b.json>       Compare two scan files for determinism
picosentry init [target] [--force]      Generate .picosentry.yml template
picosentry update [--top N]             Download latest npm corpus (requires network)
Scan Options:
  --format, -f            json, sarif, table, ml-context, github, cyclonedx (default: table)
  --output, -o            Write output to file instead of stdout
  --rules, -r             Run only specific rules (e.g., L2-POST-001 L2-TYPO-001)
  --corpus, -c            Path to corpus directory (default: built-in)
  --no-color              Disable colored output (table format only)
  --token-budget          Token budget for ml-context format (default: 4096)
  --exit-code             Exit with code 1 if findings found
  --fail-on               Exit 1 only if findings at or above severity (implies --exit-code)
  --quiet, -q             Summary only, no detailed findings
  --summary               One-line summary for CI notifications
  --baseline, -b          Path to baseline JSON or ignore file
  --baseline-update       Write updated baseline after filtering
  --verbose, -v           Show per-rule timing and scan details on stderr
  --timeout               Scan timeout in seconds (0 = no timeout, exit code 3 on timeout)
  --log-format            Log output format: text (default) or json for SIEM integration
  --fail-on-rule-error    Exit code 4 if any detector rule raises an exception (fail-closed)
  --verify-determinism    Run scan twice and verify SHA-256 determinism
  --deterministic-output  Omit timestamps and timing for byte-stable JSON output
  --sarif-file            Path for SARIF output file for --format github (default: sarif.json)
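The interaction between --exit-code and --fail-on can be pictured with a small Python sketch. This is an illustration under assumptions, not PicoSentry's code: the severity ordering is inferred from the labels used in this README, and exit_code_for is a hypothetical helper that models a run where --exit-code is in effect.

```python
# Assumed ordering from least to most severe, based on the labels in this README.
SEVERITY_ORDER = ("INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL")


def exit_code_for(findings, fail_on=None):
    """Return 1 if any finding is at or above fail_on, else 0.

    With fail_on=None, any finding at all yields 1 (plain --exit-code behaviour).
    """
    if fail_on is None:
        return 1 if findings else 0
    threshold = SEVERITY_ORDER.index(fail_on.upper())
    hit = any(SEVERITY_ORDER.index(f["severity"]) >= threshold for f in findings)
    return 1 if hit else 0


findings = [{"rule": "L2-LOCK-001", "severity": "MEDIUM"}]
print(exit_code_for(findings))                  # 1: a finding exists
print(exit_code_for(findings, fail_on="high"))  # 0: MEDIUM is below HIGH
```

A threshold of this kind lets a CI job keep reporting lower-severity findings without failing the build on them.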
Claw Pinch Branding
Human-facing table output uses lobster-themed severity labels:
| Standard Severity | PicoSentry Label |
|---|---|
| CRITICAL / HIGH | HARD PINCH 🦞 |
| MEDIUM | SOFT PINCH |
| LOW / INFO | NUDGE |
Clean scan: "No pinches. All clear. 🦞"
Machine formats (JSON, SARIF, CycloneDX, ml-context) use standard severity labels for CI/CD compatibility.
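The split between branded and standard labels amounts to a lookup that is applied only for the table format. A minimal Python sketch, where PINCH_LABELS and display_label are hypothetical names and not part of PicoSentry's public API:

```python
# Mapping taken from the table above; "\N{LOBSTER}" is the lobster emoji.
PINCH_LABELS = {
    "CRITICAL": "HARD PINCH \N{LOBSTER}",
    "HIGH": "HARD PINCH \N{LOBSTER}",
    "MEDIUM": "SOFT PINCH",
    "LOW": "NUDGE",
    "INFO": "NUDGE",
}


def display_label(severity: str, fmt: str = "table") -> str:
    """Branded label for the table format, standard severity for machine formats."""
    severity = severity.upper()
    if severity not in PINCH_LABELS:
        raise ValueError(f"unknown severity: {severity}")
    return PINCH_LABELS[severity] if fmt == "table" else severity


print(display_label("medium"))        # SOFT PINCH
print(display_label("high", "json"))  # HIGH
```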
Design Principles
- Deterministic by construction: sha256(scan_a) == sha256(scan_b) on identical inputs + corpus version
- Offline at scan time: No HTTP calls during scanning. Corpus is local and versioned.
- Pure functions: Rules are (target_path, corpus_dir) → List[Finding]. No global state, no randomness.
- No narrative in findings: Output is structured data. The consumer formats.
- ML-safe: --format ml-context produces compact, token-budgeted output designed for LLM tool results.
- SBOM-first: --format cyclonedx produces a full CycloneDX 1.5 SBOM with component inventory, purl, and hash verification.
- CI/CD native: Sigstore-signed releases, GitHub Actions, pre-commit hooks, workspace scanning, --fail-on-rule-error.
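The pure-function rule shape, (target_path, corpus_dir) → List[Finding], can be sketched as follows. This is an assumed illustration: the Finding fields, the INSTALL_HOOKS tuple, and post_install_rule are hypothetical and only mirror the signature described above; the real L2-POST-001 detector may work differently.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass(frozen=True, order=True)
class Finding:
    # Hypothetical fields; the real Finding model may carry more.
    rule_id: str
    severity: str
    package: str
    detail: str


INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def post_install_rule(target_path: Path, corpus_dir: Path) -> List[Finding]:
    """Flag install lifecycle scripts declared in the project's package.json.

    corpus_dir is unused here; it is kept so every rule shares one signature.
    """
    manifest = Path(target_path) / "package.json"
    if not manifest.is_file():
        return []
    data = json.loads(manifest.read_text(encoding="utf-8"))
    scripts = data.get("scripts", {})
    findings = [
        Finding("L2-POST-001", "HIGH", data.get("name", ""), f"{hook}: {scripts[hook]}")
        for hook in INSTALL_HOOKS
        if hook in scripts
    ]
    return sorted(findings)  # fixed order, independent of how findings were collected


# Demo against a throwaway project directory.
with tempfile.TemporaryDirectory() as tmp:
    manifest = {"name": "demo", "scripts": {"postinstall": "node setup.js", "test": "jest"}}
    Path(tmp, "package.json").write_text(json.dumps(manifest), encoding="utf-8")
    demo_findings = post_install_rule(Path(tmp), Path(tmp))

print(demo_findings)
```

Because the function reads only its arguments and returns sorted, immutable values, calling it twice on the same directory yields equal results.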
Configuration File
PicoSentry reads .picosentry.yml from the target directory (or .picosentry.yaml / picosentry.config.yml). Config file values are defaults; CLI flags override them.
version: 1
# Output format: json, sarif, table, ml-context, github, cyclonedx
format: json
# Disable colored output
no_color: true
# Exit with code 1 if findings found
exit_code: true
# Only fail CI on HIGH or above
fail_on: high
# Suppress known findings from previous scan
baseline: baseline.json
# Severity overrides: downgrade/upgrade rule severity
severity_overrides:
  L2-PROV-001: INFO   # Downgrade provenance to info
  L2-FORK-001: LOW    # Downgrade fork drift to low
# Token budget for ml-context format
token_budget: 2048
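The precedence described above (built-in defaults, then config file values, then CLI flags) and the effect of severity_overrides can be sketched in Python. The helper names merge_settings and apply_severity_overrides are hypothetical, and the config is shown as an already-parsed dictionary to keep the example free of a YAML dependency.

```python
from typing import Any, Dict, List

# Defaults documented in the CLI reference: table format, 4096-token budget.
BUILTIN_DEFAULTS: Dict[str, Any] = {"format": "table", "token_budget": 4096}


def merge_settings(file_config: Dict[str, Any], cli_flags: Dict[str, Any]) -> Dict[str, Any]:
    """Config file values replace built-in defaults; CLI flags that were set win over both."""
    merged = dict(BUILTIN_DEFAULTS)
    merged.update(file_config)
    merged.update({k: v for k, v in cli_flags.items() if v is not None})
    return merged


def apply_severity_overrides(findings: List[dict], overrides: Dict[str, str]) -> List[dict]:
    """Return copies of the findings with severity replaced where a rule is overridden."""
    return [dict(f, severity=overrides.get(f["rule_id"], f["severity"])) for f in findings]


file_config = {
    "format": "json",
    "fail_on": "high",
    "token_budget": 2048,
    "severity_overrides": {"L2-PROV-001": "INFO"},
}
cli_flags = {"format": "sarif", "token_budget": None}  # None means "flag not given"

settings = merge_settings(file_config, cli_flags)
adjusted = apply_severity_overrides(
    [{"rule_id": "L2-PROV-001", "severity": "MEDIUM"}],
    settings["severity_overrides"],
)
print(settings["format"], settings["token_budget"], adjusted[0]["severity"])
```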
Workspace Scanning
Scan entire monorepos with one command. Supports pnpm workspaces, Nx, Turborepo, Lerna.
picosentry workspace . --format json
picosentry workspace . --fail-on high --quiet
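As a rough picture of workspace discovery, the following Python sketch walks a directory tree and collects every folder that contains a package.json, skipping node_modules. It is an assumption about the general approach, not PicoSentry's discovery logic, which may also read pnpm, Nx, Turborepo, or Lerna workspace manifests; discover_projects and SKIP_DIRS are hypothetical names.

```python
import os
import tempfile
from pathlib import Path
from typing import List

SKIP_DIRS = {"node_modules", ".git"}


def discover_projects(root: Path) -> List[Path]:
    """Return every directory under root that holds a package.json, in sorted order."""
    projects = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        if "package.json" in filenames:
            projects.append(Path(dirpath))
    return sorted(projects)


# Demo: a tiny monorepo with two packages and one installed dependency.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in (".", "packages/a", "packages/b", "node_modules/x"):
        project = root / rel
        project.mkdir(parents=True, exist_ok=True)
        (project / "package.json").write_text("{}", encoding="utf-8")
    demo_projects = [p.relative_to(root).as_posix() for p in discover_projects(root)]

print(demo_projects)  # ['.', 'packages/a', 'packages/b']
```

Sorting the result keeps the scan order independent of filesystem enumeration order.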
Custom IoC Registry
Register your own indicators of compromise. Custom IoCs never leave your machine.
# Register a custom IoC
picosentry ioc register ./suspicious-package.json
# Export for sharing across teams
picosentry corpus export ./team-iocs.json --name "acme-corp-iocs"
# Import community packs
picosentry corpus import ./community-threats.json
# Validate a pack before importing
picosentry corpus validate ./pack.json
Sigstore Verification
Every release is signed via Sigstore with OIDC identity from GitHub Actions.
# Verify a release artifact
./scripts/verify_release.sh v0.15.0
# Or manually
python -m sigstore verify identity \
--cert-identity "https://github.com/KirkForge/PicoSentry/.github/workflows/release.yml@refs/tags/v0.15.0" \
--cert-oidc-issuer "https://token.actions.githubusercontent.com" \
picosentry-0.15.0-py3-none-any.whl
Pre-commit Hooks
Drop into any npm/pnpm project's .pre-commit-config.yaml:
repos:
  - repo: https://github.com/KirkForge/PicoSentry
    rev: v0.15.0
    hooks:
      - id: picosentry-scan        # Full 21-rule scan
      - id: picosentry-check       # Fast CI check (HIGH+CRITICAL only)
      - id: picosentry-workspace   # Monorepo scan
Supply Chain Attack Coverage (21 Rules)
| Rule ID | Attack Vector | Severity |
|---|---|---|
| L2-POST-001 | Post-install scripts | HIGH |
| L2-OBFS-001 | eval / Function obfuscation | HIGH |
| L2-OBFS-002 | Hex-encoded payloads | MEDIUM |
| L2-OBFS-003 | base64+eval obfuscation | MEDIUM |
| L2-OBFS-004 | Unicode escape obfuscation | LOW |
| L2-DEPC-001 | Dependency confusion | HIGH |
| L2-TYPO-001 | Typosquatting | HIGH |
| L2-MANI-001 | Manifest tampering | HIGH |
| L2-MANI-002 | Optional deps with scripts | MEDIUM |
| L2-FORK-001 | Fork drift | HIGH |
| L2-CRED-001 | Credential / secret leak | HIGH |
| L2-LOCK-001 | Lockfile drift | MEDIUM |
| L2-BUND-001 | Bundled shadow dependencies | HIGH |
| L2-PROV-001 | Provenance / integrity | MEDIUM |
| L2-MAINT-001 | Maintainer change / takeover | HIGH |
| L2-PNPM-001 | pnpm dangerous config | HIGH |
| L2-LICENSE-001 | License issues | MEDIUM |
| L2-ENGIN-001 | Engine issues | LOW |
| L2-SIDELOAD-001 | Protocol sideloading | MEDIUM |
| L2-IOC-001 | Custom IoC detection | INFO–CRITICAL |
| L2-ADV-001 | Advisory vulnerability (OSV/GHSA/npm) | MEDIUM–CRITICAL |
See SCAAT.md for the full attack-vector-to-rule mapping with confidence levels.
Deterministic Guard Stack
PicoSentry enforces determinism at four layers:
┌─────────────────────────────────────────┐
│ Layer 4: CI Gate                        │
│ --verify-determinism (CLI)              │
│ Runs scan twice, asserts SHA-256 match  │
├─────────────────────────────────────────┤
│ Layer 3: Diff                           │
│ picosentry diff a.json b.json           │
│ Compare two saved scans field-by-field  │
├─────────────────────────────────────────┤
│ Layer 2: Guard (runtime)                │
│ Validates invariants after each scan    │
├─────────────────────────────────────────┤
│ Layer 1: Models (structural)            │
│ Frozen dataclasses, sorted keys         │
└─────────────────────────────────────────┘
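Layers 3 and 4 of the stack above can be approximated in a few lines of Python. This is a sketch under assumptions, not the code in guards.py: verify_determinism takes any zero-argument scan callable, reuses the documented exit codes (0 for a match, 4 for a violation), and diff_fields is a hypothetical helper that reports where two saved scans differ.

```python
import hashlib
import itertools
import json
from typing import Any, Callable, List


def canonical_sha256(scan: Any) -> str:
    """Hash a scan result via sorted-key JSON."""
    return hashlib.sha256(json.dumps(scan, sort_keys=True).encode("utf-8")).hexdigest()


def verify_determinism(scan_fn: Callable[[], dict]) -> int:
    """Run the scan twice; return 0 if the digests match, 4 otherwise."""
    first, second = scan_fn(), scan_fn()
    return 0 if canonical_sha256(first) == canonical_sha256(second) else 4


def diff_fields(a: Any, b: Any, path: str = "") -> List[str]:
    """List the paths at which two JSON-like values differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        out: List[str] = []
        for key in sorted(set(a) | set(b)):
            child = f"{path}.{key}" if path else key
            if key not in a or key not in b:
                out.append(child)  # field present on one side only
            else:
                out.extend(diff_fields(a[key], b[key], child))
        return out
    if isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        out = []
        for i, (x, y) in enumerate(zip(a, b)):
            out.extend(diff_fields(x, y, f"{path}[{i}]"))
        return out
    return [] if a == b else [path or "<root>"]


counter = itertools.count()
stable_scan = lambda: {"findings": [{"severity": "HIGH"}], "stats": {"files": 3}}
flaky_scan = lambda: {"findings": [], "run": next(counter)}  # changes on every call

print(verify_determinism(stable_scan))  # 0
print(verify_determinism(flaky_scan))   # 4
print(diff_fields(stable_scan(), {"findings": [{"severity": "LOW"}], "stats": {"files": 3}}))
```

The hash comparison answers whether two runs differ; the field diff answers where, which is the more useful output when a determinism check fails in CI.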
Architecture
picosentry/
├── __init__.py                     # Public API + __version__
├── __main__.py                     # python -m picosentry
├── cli.py                          # CLI entry point (scan, check, workspace, corpus, rules, diff, init, update)
├── engine.py                       # ScanEngine orchestrator (per-rule timing, corpus resolution)
├── guards.py                       # Deterministic guard stack (enforcement + verification)
├── models.py                       # Finding, ScanResult, ScanStats, BaselineResult (frozen dataclasses)
├── config.py                       # .picosentry.yml loader + merge
├── logging.py                      # Structured JSON logging for SIEM
├── workspace.py                    # Multi-project/monorepo workspace scanning
├── ioc_registry.py                 # Custom IoC registration + management
├── corpus_share.py                 # Corpus pack export/import/validate (marketplace)
├── corpus/
│   ├── npm_top_packages.json       # 327 top npm packages (typosquat targets)
│   └── ioc/                        # IoC metadata (event-stream, Shai-Hulud, left-pad, etc.)
├── rules/
│   ├── post_install.py             # L2-POST-001
│   ├── obfuscation.py              # L2-OBFS-001..004
│   ├── dep_confusion.py            # L2-DEPC-001
│   ├── typosquat.py                # L2-TYPO-001
│   ├── manifest.py                 # L2-MANI-001/002
│   ├── fork_drift.py               # L2-FORK-001
│   ├── credential_read.py          # L2-CRED-001
│   ├── pnpm_lock_parser.py         # pnpm-lock.yaml v6+ parser
│   ├── lockfile_drift.py           # L2-LOCK-001
│   ├── bundled_shadow.py           # L2-BUND-001
│   ├── provenance.py               # L2-PROV-001
│   ├── maintainer_change.py        # L2-MAINT-001
│   ├── pnpm_config.py              # L2-PNPM-001
│   ├── license.py                  # L2-LICENSE-001
│   ├── engine.py                   # L2-ENGIN-001
│   └── sideloading.py              # L2-SIDELOAD-001
├── formatters/
│   ├── json_fmt.py                 # Deterministic JSON (sorted keys)
│   ├── sarif.py                    # SARIF 2.1.0
│   ├── table.py                    # Human-readable with claw pinch branding
│   ├── ml_context.py               # Token-budgeted for LLM tool results
│   ├── github.py                   # SARIF file + markdown summary for GitHub Actions
│   └── cyclonedx.py                # CycloneDX 1.5 SBOM
└── tests/
    ├── test_scanner.py             # Core scanner + determinism
    ├── test_guards.py              # Deterministic guard stack
    ├── test_cli.py                 # CLI integration
    ├── test_config.py              # Config file parsing
    ├── test_config_integration.py  # Config + scan integration
    ├── test_docs.py                # Rule documentation completeness
    ├── test_init_and_sarif.py      # Init command + SARIF format
    ├── test_license.py             # License compliance
    ├── test_pnpm_lock_parser.py    # pnpm lockfile parser
    ├── test_sideloading.py         # Protocol sideloading
    ├── test_benchmark.py           # Performance benchmarks
    └── fixtures/                   # Test projects (IoC regression suite)
License & Attestations
- License: Business Source License 1.1 (BUSL-1.1), see LICENSE; commercial use requires a license, see COMMERCIAL-LICENSE.md
- SCAAT: SCAAT.md (Supply Chain Attacks and Threats mapping)
- SLSA: SLSA.md (SLSA Build L3 roadmap)
- Security: SECURITY.md (vulnerability reporting)
- Citation: CITATION.cff (academic citation)