PicoSentry: deterministic supply-chain scanner for npm/pnpm, safe for ML pipelines

PicoSentry 🦞

Deterministic, offline supply-chain scanner for npm/pnpm, safe for ML pipelines.

Same inputs + same corpus version = same findings and scan fingerprint. Every time.

No HTTP at scan time. No probabilistic heuristics. No narrative in findings.

Note on determinism: Default JSON output includes audit timestamps and timing data (useful for forensics). For byte-identical output across runs, use --deterministic-output or --verify-determinism, which omit timestamps and timing for reproducible CI artifacts.
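The difference between default and byte-stable output can be illustrated with a minimal Python sketch: strip the volatile fields, serialize with sorted keys, and hash the result. The field names `timestamp` and `duration_ms` are hypothetical placeholders, not PicoSentry's actual JSON schema, and this is not the project's own implementation.

```python
import hashlib
import json

# Hypothetical names for volatile fields; PicoSentry's real JSON schema may differ.
VOLATILE_KEYS = frozenset({"timestamp", "duration_ms"})


def strip_volatile(value, volatile=VOLATILE_KEYS):
    """Return a copy of a JSON-like value with volatile keys removed at every depth."""
    if isinstance(value, dict):
        return {k: strip_volatile(v, volatile) for k, v in value.items() if k not in volatile}
    if isinstance(value, list):
        return [strip_volatile(v, volatile) for v in value]
    return value


def fingerprint(scan):
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    canonical = json.dumps(strip_volatile(scan), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


run_a = {"findings": [{"rule": "L2-POST-001"}], "timestamp": "2024-01-01T00:00:00Z", "duration_ms": 12}
run_b = {"duration_ms": 57, "timestamp": "2024-01-02T09:30:00Z", "findings": [{"rule": "L2-POST-001"}]}
print(fingerprint(run_a) == fingerprint(run_b))  # True: the runs differ only in volatile fields
```

Hashing the raw default output instead would produce different digests for the two runs, which is why the flags above exist.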

Quick Start

# Install
pip install picosentry

# Or from source
pip install -e .

# Or use Docker
docker build -t picosentry .
docker run --rm -v $(pwd):/scan picosentry scan /scan

# Scan a project
picosentry scan ./my-project

# CI-friendly health check (exit 1 on HIGH+CRITICAL only)
picosentry check ./my-project --fail-on high

# JSON output (sorted keys; add --deterministic-output for byte-stable output)
picosentry scan ./my-project --format json

# Monorepo workspace scan
picosentry workspace . --format table

# CycloneDX SBOM
picosentry scan ./my-project --format cyclonedx

# Verify determinism: run twice, assert SHA-256 match (exit 0=match, 4=violation)
picosentry scan ./my-project --verify-determinism

# Manage custom IoC corpus packs
picosentry corpus export ./my-iocs.json
picosentry corpus import ./community-pack.json
picosentry corpus list

# Manage custom IoCs
picosentry ioc register ./suspicious-pkg.json
picosentry ioc list

CLI Reference

picosentry scan <target> [OPTIONS]     Scan a project directory
picosentry check <target> [OPTIONS]    CI-optimized health check (exit-code only)
picosentry workspace <root> [OPTIONS]  Scan entire monorepo (discovers all npm/pnpm projects)
picosentry corpus export <output>      Export custom IoCs as a shareable pack
picosentry corpus import <path>        Import a corpus pack into your IoC registry
picosentry corpus validate <path>      Validate a corpus pack without importing
picosentry corpus list                 List available corpus packs
picosentry ioc register <path>         Register a custom IoC indicator
picosentry ioc list                    List user-registered custom IoCs
picosentry ioc remove <id>             Remove a custom IoC by ID
picosentry rules [--json]              List available detector rules
picosentry version                     Show version, corpus version, rule count
picosentry diff <a.json> <b.json>      Compare two scan files for determinism
picosentry init [target] [--force]     Generate .picosentry.yml template
picosentry update [--top N]            Download latest npm corpus (requires network)

Scan Options:
  --format, -f            json, sarif, table, ml-context, github, cyclonedx (default: table)
  --output, -o            Write output to file instead of stdout
  --rules, -r             Run only specific rules (e.g., L2-POST-001 L2-TYPO-001)
  --corpus, -c            Path to corpus directory (default: built-in)
  --no-color              Disable colored output (table format only)
  --token-budget          Token budget for ml-context format (default: 4096)
  --exit-code             Exit with code 1 if findings found
  --fail-on               Exit 1 only if findings at or above severity (implies --exit-code)
  --quiet, -q             Summary only, no detailed findings
  --summary               One-line summary for CI notifications
  --baseline, -b          Path to baseline JSON or ignore file
  --baseline-update       Write updated baseline after filtering
  --verbose, -v           Show per-rule timing and scan details on stderr
  --timeout               Scan timeout in seconds (0 = no timeout, exit code 3 on timeout)
  --log-format            Log output format: text (default) or json for SIEM integration
  --fail-on-rule-error    Exit code 4 if any detector rule raises an exception (fail-closed)
  --verify-determinism    Run scan twice and verify SHA-256 determinism
  --deterministic-output  Omit timestamps and timing for byte-stable JSON output
  --sarif-file            Path for SARIF output file for --format github (default: sarif.json)
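
For CI scripts written in Python, the exit codes mentioned in this README can be wrapped in a small helper. This is a sketch only: the helper names are hypothetical, the meaning attached to exit code 0 outside of --verify-determinism is an assumption, and any code not listed in this README is reported as unexpected rather than guessed.

```python
import subprocess

# Exit codes as described in this README; 4 is shared by two opt-in checks.
EXIT_MEANINGS = {
    0: "clean (assumed), or determinism verified",
    1: "findings at or above the --fail-on threshold",
    3: "scan timed out (--timeout)",
    4: "rule error (--fail-on-rule-error) or determinism violation (--verify-determinism)",
}


def build_check_command(target, fail_on="high"):
    """Build the argv list for `picosentry check`; a list avoids shell quoting issues."""
    return ["picosentry", "check", target, "--fail-on", fail_on]


def describe_exit(code):
    """Translate an exit code into a short human-readable meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_check(target, fail_on="high"):
    """Run the check and return (exit_code, meaning). Requires picosentry on PATH."""
    completed = subprocess.run(build_check_command(target, fail_on), check=False)
    return completed.returncode, describe_exit(completed.returncode)


print(build_check_command("./my-project"))
```

Calling `run_check("./my-project")` in a pipeline step returns the raw code alongside its meaning, so the caller can decide whether a timeout should fail the build differently from a finding.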

Claw Pinch Branding

Human-facing table output uses lobster-themed severity labels:

Standard Severity    PicoSentry Label
CRITICAL / HIGH      HARD PINCH 🦞
MEDIUM               SOFT PINCH
LOW / INFO           NUDGE

Clean scan: "No pinches. All clear. 🦞"

Machine formats (JSON, SARIF, CycloneDX, ml-context) use standard severity labels for CI/CD compatibility.
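
The split between human-facing and machine-facing labels can be expressed as a simple lookup. The sketch below uses the mapping from the table above; the function name is hypothetical, and treating only `table` as a human-facing format is an assumption (this README does not say which labels the `github` format uses).

```python
# Severity-to-label mapping taken from the table above.
PINCH_LABELS = {
    "CRITICAL": "HARD PINCH 🦞",
    "HIGH": "HARD PINCH 🦞",
    "MEDIUM": "SOFT PINCH",
    "LOW": "NUDGE",
    "INFO": "NUDGE",
}

# Assumption: only the table format is human-facing.
HUMAN_FORMATS = frozenset({"table"})


def display_severity(severity, output_format):
    """Return the branded label for human output, the standard label otherwise."""
    if output_format in HUMAN_FORMATS:
        return PINCH_LABELS[severity]
    return severity


print(display_severity("MEDIUM", "table"))  # SOFT PINCH
print(display_severity("MEDIUM", "json"))   # MEDIUM
```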

Design Principles

  1. Deterministic by construction: sha256(scan_a) == sha256(scan_b) on identical inputs + corpus version
  2. Offline at scan time: No HTTP calls during scanning. Corpus is local and versioned.
  3. Pure functions: Rules are (target_path, corpus_dir) → List[Finding]. No global state, no randomness.
  4. No narrative in findings: Output is structured data. The consumer formats.
  5. ML-safe: --format ml-context produces compact, token-budgeted output designed for LLM tool results.
  6. SBOM-first: --format cyclonedx produces a full CycloneDX 1.5 SBOM with component inventory, purl, and hash verification.
  7. CI/CD native: Sigstore-signed releases, GitHub Actions, pre-commit hooks, workspace scanning, --fail-on-rule-error.
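
To make the pure-function principle concrete, here is a minimal sketch of a rule with the `(target_path, corpus_dir) → List[Finding]` shape. The `Finding` fields and the rule body are illustrative assumptions, not PicoSentry's actual model or its L2-POST-001 implementation.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass(frozen=True, order=True)
class Finding:
    # Field names are illustrative, not PicoSentry's actual model.
    rule_id: str
    severity: str
    path: str
    detail: str


LIFECYCLE_HOOKS = ("preinstall", "install", "postinstall")


def lifecycle_script_rule(target_path: str, corpus_dir: str) -> List[Finding]:
    """Flag install-time lifecycle scripts declared in the target's package.json.

    corpus_dir is unused here but kept so the signature matches the rule contract.
    """
    manifest = Path(target_path) / "package.json"
    if not manifest.is_file():
        return []
    scripts = json.loads(manifest.read_text(encoding="utf-8")).get("scripts") or {}
    findings = [
        Finding("L2-POST-001", "HIGH", "package.json", f"{hook}: {scripts[hook]}")
        for hook in LIFECYCLE_HOOKS
        if hook in scripts
    ]
    return sorted(findings)  # explicit ordering keeps the output stable


with tempfile.TemporaryDirectory() as project:
    (Path(project) / "package.json").write_text(
        json.dumps({"name": "demo", "scripts": {"postinstall": "node setup.js"}}),
        encoding="utf-8",
    )
    first = lifecycle_script_rule(project, corpus_dir="")
    second = lifecycle_script_rule(project, corpus_dir="")
    print(first == second, len(first))  # True 1
```

Because the function reads only its arguments and returns immutable, explicitly sorted values, two calls on the same directory compare equal, which is the property the determinism claims rest on.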

Configuration File

PicoSentry reads .picosentry.yml from the target directory (or .picosentry.yaml / picosentry.config.yml). Config file values are defaults; CLI flags override them.

version: 1

# Output format: json, sarif, table, ml-context, github, cyclonedx
format: json

# Disable colored output
no_color: true

# Exit with code 1 if findings found
exit_code: true

# Only fail CI on HIGH or above
fail_on: high

# Suppress known findings from previous scan
baseline: baseline.json

# Severity overrides: downgrade/upgrade rule severity
severity_overrides:
  L2-PROV-001: INFO        # Downgrade provenance to info
  L2-FORK-001: LOW         # Downgrade fork drift to low

# Token budget for ml-context format
token_budget: 2048
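
The precedence rule (config values are defaults, CLI flags override them) and the interaction between `severity_overrides` and `fail_on` can be sketched as follows. The function names are hypothetical, and two details are assumptions rather than documented behavior: severities are ordered INFO < LOW < MEDIUM < HIGH < CRITICAL, and overrides are applied before the `fail_on` threshold is evaluated.

```python
# Assumed ordering, lowest to highest.
SEVERITY_ORDER = ["INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"]


def resolve_settings(config, cli_flags):
    """Config values act as defaults; CLI flags that were actually passed (not None) win."""
    merged = dict(config)
    merged.update({key: value for key, value in cli_flags.items() if value is not None})
    return merged


def effective_severity(rule_id, default_severity, overrides):
    """Apply a per-rule override if one is configured."""
    return overrides.get(rule_id, default_severity)


def should_fail(severities, fail_on):
    """True if any severity is at or above the fail_on threshold."""
    threshold = SEVERITY_ORDER.index(fail_on.upper())
    return any(SEVERITY_ORDER.index(severity) >= threshold for severity in severities)


config = {"format": "json", "fail_on": "high", "severity_overrides": {"L2-FORK-001": "LOW"}}
settings = resolve_settings(config, {"format": "sarif", "fail_on": None})
fork = effective_severity("L2-FORK-001", "HIGH", settings["severity_overrides"])
print(settings["format"], fork, should_fail([fork], settings["fail_on"]))  # sarif LOW False
```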

Workspace Scanning

Scan entire monorepos with one command. Supports pnpm workspaces, Nx, Turborepo, Lerna.

picosentry workspace . --format json
picosentry workspace . --fail-on high --quiet
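
As a rough illustration of what project discovery in a monorepo involves, the sketch below walks a directory tree and collects every directory containing a package.json. This is an assumption about the general approach, not PicoSentry's actual discovery logic, which may instead read workspace definitions such as pnpm-workspace.yaml; the skip list is likewise hypothetical.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical skip list; installed dependencies are not workspace projects.
SKIP_DIRS = {"node_modules", ".git"}


def discover_projects(root):
    """Return sorted relative paths of directories that contain a package.json."""
    root_path = Path(root)
    projects = []
    for dirpath, dirnames, filenames in os.walk(root_path):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        if "package.json" in filenames:
            projects.append(Path(dirpath).relative_to(root_path).as_posix())
    return sorted(projects)


with tempfile.TemporaryDirectory() as root:
    for rel in ("", "packages/a", "node_modules/left-pad"):
        directory = Path(root) / rel
        directory.mkdir(parents=True, exist_ok=True)
        (directory / "package.json").write_text("{}", encoding="utf-8")
    found = discover_projects(root)
print(found)  # ['.', 'packages/a']
```

Sorting both the traversal and the result keeps the project order independent of filesystem enumeration order, in keeping with the determinism goals stated earlier.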

Custom IoC Registry

Register your own indicators of compromise. They never leave your machine.

# Register a custom IoC
picosentry ioc register ./suspicious-package.json

# Export for sharing across teams
picosentry corpus export ./team-iocs.json --name "acme-corp-iocs"

# Import community packs
picosentry corpus import ./community-threats.json

# Validate a pack before importing
picosentry corpus validate ./pack.json

Sigstore Verification

Every release is signed via Sigstore with OIDC identity from GitHub Actions.

# Verify a release artifact
./scripts/verify_release.sh v0.15.0

# Or manually
python -m sigstore verify identity \
  --cert-identity "https://github.com/KirkForge/PicoSentry/.github/workflows/release.yml@refs/tags/v0.15.0" \
  --cert-oidc-issuer "https://token.actions.githubusercontent.com" \
  picosentry-0.15.0-py3-none-any.whl

Pre-commit Hooks

Drop into any npm/pnpm project's .pre-commit-config.yaml:

repos:
  - repo: https://github.com/KirkForge/PicoSentry
    rev: v0.15.0
    hooks:
      - id: picosentry-scan       # Full scan (all rules)
      - id: picosentry-check      # Fast CI check (HIGH+CRITICAL only)
      - id: picosentry-workspace  # Monorepo scan

Supply Chain Attack Coverage (21 Rules)

Rule ID          Attack Vector                          Severity
L2-POST-001      Post-install scripts                   HIGH
L2-OBFS-001      eval / Function obfuscation            HIGH
L2-OBFS-002      Hex-encoded payloads                   MEDIUM
L2-OBFS-003      base64+eval obfuscation                MEDIUM
L2-OBFS-004      Unicode escape obfuscation             LOW
L2-DEPC-001      Dependency confusion                   HIGH
L2-TYPO-001      Typosquatting                          HIGH
L2-MANI-001      Manifest tampering                     HIGH
L2-MANI-002      Optional deps with scripts             MEDIUM
L2-FORK-001      Fork drift                             HIGH
L2-CRED-001      Credential / secret leak               HIGH
L2-LOCK-001      Lockfile drift                         MEDIUM
L2-BUND-001      Bundled shadow dependencies            HIGH
L2-PROV-001      Provenance / integrity                 MEDIUM
L2-MAINT-001     Maintainer change / takeover           HIGH
L2-PNPM-001      pnpm dangerous config                  HIGH
L2-LICENSE-001   License issues                         MEDIUM
L2-ENGIN-001     Engine issues                          LOW
L2-SIDELOAD-001  Protocol sideloading                   MEDIUM
L2-IOC-001       Custom IoC detection                   INFO–CRITICAL
L2-ADV-001       Advisory vulnerability (OSV/GHSA/npm)  MEDIUM–CRITICAL

See SCAAT.md for the full attack-vector-to-rule mapping with confidence levels.
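
As one example of how a deterministic rule in this table could work, typosquatting can be checked by comparing a dependency name against a fixed list of well-known package names using edit distance. The sketch below swaps in plain Levenshtein distance with a threshold of 1; it is not PicoSentry's L2-TYPO-001 algorithm, and the three-name set stands in for a real top-packages corpus.

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def typosquat_candidates(name, known_packages, max_distance=1):
    """Known packages that `name` closely resembles without matching exactly."""
    if name in known_packages:
        return []
    return sorted(p for p in known_packages if edit_distance(name, p) <= max_distance)


KNOWN = {"lodash", "express", "react"}  # stand-in for a top-packages corpus
print(typosquat_candidates("expresss", KNOWN))  # ['express']
print(typosquat_candidates("react", KNOWN))     # []
```

The check involves no network access and no randomness, so the same name and the same corpus always yield the same candidates.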

Deterministic Guard Stack

PicoSentry enforces determinism at four layers:

┌─────────────────────────────────────────┐
│  Layer 4: CI Gate                       │
│  --verify-determinism (CLI)             │
│  Runs scan twice, asserts SHA-256 match │
├─────────────────────────────────────────┤
│  Layer 3: Diff                          │
│  picosentry diff a.json b.json          │
│  Compare two saved scans field-by-field │
├─────────────────────────────────────────┤
│  Layer 2: Guard (runtime)               │
│  Validates invariants after each scan   │
├─────────────────────────────────────────┤
│  Layer 1: Models (structural)           │
│  Frozen dataclasses, sorted keys        │
└─────────────────────────────────────────┘
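
The Layer 3 idea of comparing two saved scans field by field can be sketched in a few lines of Python. This is an illustration of the technique only: the function name and the scan field names are hypothetical, and `picosentry diff` may report differences in another form.

```python
def diff_scans(a, b, path=""):
    """Return, in a deterministic order, the paths where two JSON-like values differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        differences = []
        for key in sorted(set(a) | set(b)):
            child = f"{path}.{key}" if path else str(key)
            if key not in a or key not in b:
                differences.append(child)  # key present on one side only
            else:
                differences.extend(diff_scans(a[key], b[key], child))
        return differences
    if isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        differences = []
        for index, (item_a, item_b) in enumerate(zip(a, b)):
            differences.extend(diff_scans(item_a, item_b, f"{path}[{index}]"))
        return differences
    # Scalars, mismatched types, or lists of different length.
    return [] if a == b else [path or "<root>"]


# Hypothetical scan documents; real field names may differ.
scan_a = {"corpus_version": "1", "findings": [{"rule": "L2-TYPO-001", "severity": "HIGH"}]}
scan_b = {"corpus_version": "1", "findings": [{"rule": "L2-TYPO-001", "severity": "MEDIUM"}]}
print(diff_scans(scan_a, scan_b))  # ['findings[0].severity']
```

An empty result means the two scans agree on every field, which is the condition a determinism check looks for.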

Architecture

picosentry/
├── __init__.py          # Public API + __version__
├── __main__.py          # python -m picosentry
├── cli.py               # CLI entry point (scan, check, workspace, corpus, rules, diff, init, update)
├── engine.py            # ScanEngine orchestrator (per-rule timing, corpus resolution)
├── guards.py            # Deterministic guard stack (enforcement + verification)
├── models.py            # Finding, ScanResult, ScanStats, BaselineResult (frozen dataclasses)
├── config.py            # .picosentry.yml loader + merge
├── logging.py           # Structured JSON logging for SIEM
├── workspace.py         # Multi-project/monorepo workspace scanning
├── ioc_registry.py      # Custom IoC registration + management
├── corpus_share.py      # Corpus pack export/import/validate (marketplace)
├── corpus/
│   ├── npm_top_packages.json  # 327 top npm packages (typosquat targets)
│   └── ioc/                    # IoC metadata (event-stream, Shai-Hulud, left-pad, etc.)
├── rules/
│   ├── post_install.py   # L2-POST-001
│   ├── obfuscation.py    # L2-OBFS-001..004
│   ├── dep_confusion.py  # L2-DEPC-001
│   ├── typosquat.py      # L2-TYPO-001
│   ├── manifest.py       # L2-MANI-001/002
│   ├── fork_drift.py     # L2-FORK-001
│   ├── credential_read.py  # L2-CRED-001
│   ├── pnpm_lock_parser.py # pnpm-lock.yaml v6+ parser
│   ├── lockfile_drift.py   # L2-LOCK-001
│   ├── bundled_shadow.py   # L2-BUND-001
│   ├── provenance.py       # L2-PROV-001
│   ├── maintainer_change.py # L2-MAINT-001
│   ├── pnpm_config.py      # L2-PNPM-001
│   ├── license.py           # L2-LICENSE-001
│   ├── engine.py            # L2-ENGIN-001
│   └── sideloading.py       # L2-SIDELOAD-001
├── formatters/
│   ├── json_fmt.py       # Deterministic JSON (sorted keys)
│   ├── sarif.py          # SARIF 2.1.0
│   ├── table.py          # Human-readable with claw pinch branding
│   ├── ml_context.py     # Token-budgeted for LLM tool results
│   ├── github.py         # SARIF file + markdown summary for GitHub Actions
│   └── cyclonedx.py      # CycloneDX 1.5 SBOM
└── tests/
    ├── test_scanner.py          # Core scanner + determinism
    ├── test_guards.py           # Deterministic guard stack
    ├── test_cli.py              # CLI integration
    ├── test_config.py           # Config file parsing
    ├── test_config_integration.py # Config + scan integration
    ├── test_docs.py             # Rule documentation completeness
    ├── test_init_and_sarif.py    # Init command + SARIF format
    ├── test_license.py           # License compliance
    ├── test_pnpm_lock_parser.py  # pnpm lockfile parser
    ├── test_sideloading.py       # Protocol sideloading
    ├── test_benchmark.py         # Performance benchmarks
    └── fixtures/                 # Test projects (IoC regression suite)

License & Attestations

  • License: Business Source License 1.1 (BUSL-1.1), see LICENSE. For commercial use that requires a license, see COMMERCIAL-LICENSE.md
  • SCAAT: SCAAT.md (Supply Chain Attacks and Threats mapping)
  • SLSA: SLSA.md (SLSA Build L3 roadmap)
  • Security: SECURITY.md (vulnerability reporting)
  • Citation: CITATION.cff (academic citation)
