Offline-first Python package intelligence and supply-chain security decision support.

pkgwhy

pip install pkgwhy

Know why a package exists before you or your agent trusts it.

pkgwhy is an offline-first Python package intelligence, agent policy, and local private-tool CLI. It explains installed packages, inspects local package files without importing them, reports conservative vulnerability, provenance, and static security signals, produces agent-readable JSON judgements, and can publish and run local private Python tools from a local registry.

Status

pkgwhy 1.8.0 is a Python supply-chain security decision-support tool and local pip/CI install gate for developers and AI agents. It is useful for safe metadata extraction, local package inspection, conservative static package review, agent-readable JSON, vulnerability and provenance foundations, policy checks, artifact precheck, guarded pip installs, local CI package gates, local review history and reports, local registry trust states, local tool validation, agent check dispatching, and the local private-registry and runner MVP.

It is not a production security scanner, it does not detect malware with certainty, and it is not a full sandbox. Results are evidence and signals for review, not proof that a package is safe or malicious.

Current packaged version: 1.8.0.

What Works Now

The current release includes installed package intelligence:

pkgwhy scan
pkgwhy explain typer
pkgwhy why typer
pkgwhy inspect typer
pkgwhy metadata typer --installed --json
pkgwhy metadata ./dist/example-1.0.0-py3-none-any.whl --json
pkgwhy judge typer --json
pkgwhy precheck typer --json
pkgwhy precheck "typer==0.12.5" --json
pkgwhy precheck -r requirements.txt --json
pkgwhy precheck pyproject.toml --json
pkgwhy policy init
pkgwhy policy show --json
pkgwhy review list --json
pkgwhy report --markdown pkgwhy-report.md --json pkgwhy-report.json
pkgwhy ci --requirements requirements.txt --mode strict --markdown pkgwhy-ci.md --json pkgwhy-ci.json
pkgwhy precheck annotated-doc==0.0.4 --download-artifacts --json
pkgwhy pip install typer --dry-run
pkgwhy pip install -r requirements.txt --dry-run
pkgwhy agent policy --json
pkgwhy agent precheck typer --json
pkgwhy agent check typer --json
pkgwhy risk typer
pkgwhy audit --limit 5 --json
pkgwhy audit --limit 5 --json --vulnerability-file ./osv-fixture.json
# "reqeusts" is intentionally misspelled to demonstrate typo detection.
pkgwhy typos reqeusts pandas-stubs

Implemented capabilities include:

  • Safe metadata extraction with pkgwhy metadata for installed distributions, .dist-info and .egg-info directories, wheel files, source archives, unpacked distribution metadata, development checkouts, PKG-INFO, METADATA, and pyproject.toml.
  • Metadata output formats for JSON, human-readable terminal output, CSV, and INI-style field output.
  • Installed distribution metadata using importlib.metadata.
  • Package explanation from local knowledge and installed metadata.
  • Direct, transitive, imported, unknown, and not-installed dependency status.
  • Simple requirements.txt, pyproject.toml, uv.lock, and poetry.lock dependency reasoning.
  • Installed package size and largest-file reporting.
  • Source availability and coarse readability signals.
  • JavaScript readability, minification, source-map, encoded-payload, dynamic-execution, and obfuscation-like static signals.
  • Native compiled file, executable, WASM, shell script, package-manager, setup/install-time, and CLI-entrypoint signals from file metadata.
  • AST-only Python source scanning with file/line evidence for filesystem, network, subprocess, environment, credential-pattern, dynamic-code, dynamic-import, deserialisation, unsafe YAML load, package-manager, and encoded-payload signals.
  • URL/domain extraction from small source/text files as evidence, not proof of network behavior.
  • Conservative credential-like assignment detection with suspicious values masked in output.
  • Typosquatting similarity signals with false-positive guards for common ecosystem package families.
  • Optional OSV-like vulnerability record parsing from local JSON files.
  • Explicit opt-in OSV.dev query boundary for known-vulnerability lookup, with a local response cache and stale-cache fallback when a requested online lookup is unavailable.
  • Conservative version matching for affected and fixed version ranges.
  • Metadata-derived provenance/source-trust fields from installed metadata and optional PyPI JSON, with unsupported attestation and trusted-publishing checks marked as unknown or not implemented.
  • Conservative risk level, decision, warning, recommendation, evidence, confidence, risk model version, and rule-ID output.
  • Human inspect, risk, and judge reports that show compact rule-evidence summaries, while JSON reports include structured rule details.
  • Pre-install package precheck for package requirements, requirements files, and pyproject.toml dependencies.
  • Explicit artifact-download precheck that downloads a PyPI wheel or source artifact to a temporary review directory, verifies SHA-256 when available, extracts safely, statically inspects files, and deletes temporary files by default.
  • Optional gate exit codes via pkgwhy precheck --enforce-exit-code.
  • Guarded pip install flow via pkgwhy pip install, with precheck first, stable exit codes, explicit overrides, and compact local decision logs.
  • Local policy commands for conservative team policy files: pkgwhy policy init, pkgwhy policy show, and pkgwhy policy validate.
  • Optional local review records with --save-review, plus pkgwhy review list, pkgwhy review show, and pkgwhy report.
  • Local pkgwhy ci command with advisory, strict, and agent modes, JSON and Markdown outputs, and no GitHub or cloud requirement.
  • Reusable GitHub Actions package-gate template with advisory, strict, and agent modes.
  • pkgwhy beta and pkgwhy feedback commands that print links only; they do not send telemetry or make network calls.
  • Local registry trust states for private tools: trusted, reviewed, quarantined, blocked, and unknown.
  • Local private-tool source validation with pkgwhy tool validate <path> --json.
  • Agent check dispatching with pkgwhy agent check <target> --json for package specs, dependency files, pyproject-style TOML, and local tool folders/scripts.
  • Commercial and agent platform architecture documentation for future local policy packs, team review, hosted evidence cache, shared organization policy, and agent install gateway.
  • Stable JSON output for agent workflows.
  • Schema-versioned agent policy and package precheck output.
  • Conservative non-interactive agent defaults that block unknown or high-risk package use until a human reviews the evidence.
  • Compact local agent decision logs that omit full package evidence.
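
As a rough illustration of the importlib.metadata item in the list above, the standard library can read an installed distribution's metadata without importing the package itself. The helper name read_installed_metadata and the choice of returned fields are assumptions made for this sketch; they are not pkgwhy's internal API.

```python
from importlib import metadata
from typing import Optional


def read_installed_metadata(name: str) -> Optional[dict]:
    """Return a few metadata fields for an installed distribution, or None.

    Only the distribution's metadata files are read; the package is not imported.
    """
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    meta = dist.metadata
    return {
        "name": meta.get("Name"),
        "version": dist.version,
        "summary": meta.get("Summary"),
        # dist.requires is None when no requirements are declared.
        "requires": list(dist.requires or []),
    }
```

Returning None for a missing distribution keeps "not installed" distinct from "installed with empty metadata", which is the kind of distinction the dependency-status item above relies on.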

The local private-tool MVP supports a local registry, local publishing, tool judgement, and controlled local execution:

pkgwhy registry init ~/.pkgwhy/registry
pkgwhy registry list
pkgwhy registry add local-copy ~/.pkgwhy/registry
pkgwhy registry use local
pkgwhy publish ./my_tool.py
pkgwhy registry trust local/my_tool
pkgwhy registry quarantine local/my_tool
pkgwhy registry blocked
pkgwhy tool inspect local/my_tool
pkgwhy tool judge local/my_tool --json
pkgwhy tool validate ./my-tool-folder --json
pkgwhy run local/my_tool
pkgwhy run local/my_tool --non-interactive

pkgwhy run resolves tools only from the configured local registry, verifies the stored bundle hash before execution, runs simple Python-script entrypoints in a per-tool virtual environment, and writes execution metadata logs under the registry directory.

Every run prints this warning:

This run uses a Python virtual environment for dependency isolation. It does not fully sandbox operating-system permissions.

Current runner policy is intentionally conservative:

  • Unknown tools are not resolved or run; a valid local registry entry is required.
  • Duplicate owner/name/version publishes are blocked instead of silently replacing a registry entry.
  • Corrupt registry indexes fail closed for publish and tool-judgement paths.
  • Symlinks are not bundled, and stored registry paths must remain inside the registry root.
  • Bundle hash mismatch or a missing bundle blocks execution.
  • Quarantined or blocked registry trust states block execution.
  • sandbox_only and block tool judgements block execution.
  • Non-interactive execution is blocked unless both the judgement and manifest agent policy allow it.
  • Tools with declared dependencies are not run yet because dependency installation is not implemented.
  • Unsupported entrypoints, including shell scripts, absolute paths, and path traversal, are rejected.
  • Tool signatures report not_implemented; unsigned local tools are a manual-review signal, not a verified trust claim.
  • Successful run logs include the pre-run policy decision, reasons, and warnings.
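
The entrypoint rule in the list above (no shell scripts, absolute paths, or path traversal) can be sketched with pure path handling. This is a minimal illustration under stated assumptions: the function name entrypoint_problem and the reason strings are hypothetical, and pkgwhy's real checks may be stricter.

```python
from pathlib import PurePosixPath, PureWindowsPath
from typing import Optional


def entrypoint_problem(entrypoint: str) -> Optional[str]:
    """Return a rejection reason for a manifest entrypoint, or None if it passes."""
    posix = PurePosixPath(entrypoint)
    windows = PureWindowsPath(entrypoint)
    # Reject absolute paths in either path flavour.
    if posix.is_absolute() or windows.is_absolute():
        return "absolute path"
    # Reject any parent-directory component, whichever separator was used.
    if ".." in posix.parts or ".." in windows.parts:
        return "path traversal"
    # Only simple Python-script entrypoints are accepted in this sketch.
    if posix.suffix != ".py":
        return "not a Python script"
    return None
```

Working on pure paths means the check never touches the filesystem, so it can run before any bundle is unpacked.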

What Is Not Implemented Yet

These are roadmap items, not current features:

  • Complete vulnerability database coverage, transitive vulnerability analysis, or guaranteed advisory freshness.
  • Default online vulnerability lookup. Network access is only used when explicitly requested.
  • Persistent cached PyPI/source lookup beyond the current OSV response cache.
  • Cloud/private remote registry backends.
  • pull, mirroring, and remote synchronization.
  • Tool dependency installation in the runner.
  • Tool bundle signing and signature verification.
  • Tool lock/verify commands and registry export/import.
  • Dynamic sandbox analysis for arbitrary packages.
  • Tool-specific pkgwhy agent judge expansion beyond the current package precheck path.
  • pkgwhy agent explain-decision <review-id>.
  • Cloud/model-backed review.
  • Billing, API keys, team plans, or enterprise deployment.
  • OS-level sandboxing or container isolation.
  • Production security-tool guarantees.

Install

Install from PyPI after the release is published:

python -m pip install pkgwhy

For local development from this repository:

python -m pip install -e ".[dev]"
pkgwhy --help

Install directly from GitHub after the repository is public, or from an authorized private checkout:

python -m pip install "pkgwhy @ git+https://github.com/devlukeg/pkgwhy.git"

Runtime dependencies are intentionally small:

  • typer for the command-line interface.
  • rich for terminal tables and formatted human output.
  • pydantic for stable structured judgement, manifest, registry, and report models.
  • packaging for dependency and requirement parsing.

Development-only dependencies are pytest, build, and twine.

Quickstart

Inspect an installed package:

pkgwhy inspect typer

Explain why it may be present:

pkgwhy why typer

Emit machine-readable judgement JSON:

pkgwhy judge typer --json

Extract package metadata without importing or executing package code:

pkgwhy metadata typer --installed --json
pkgwhy metadata ./dist/example-1.0.0-py3-none-any.whl --format ini --field Name --field Version
pkgwhy metadata pyproject.toml --format csv

pkgwhy metadata is a safe metadata reader, not a trust verdict. It reads package metadata files and archives with bounded parsing and path-traversal checks, then leaves risk, provenance, vulnerability, policy, and static-analysis decisions to precheck, judge, audit, and ci.
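
To make "bounded parsing and path-traversal checks" concrete, here is a minimal sketch of reading Name and Version from a wheel's METADATA member without extracting the archive. The size bound, the function names, and the demo wheel builder are assumptions for illustration only; they do not describe pkgwhy's actual implementation.

```python
import io
import zipfile
from email.parser import Parser
from pathlib import PurePosixPath

MAX_METADATA_BYTES = 1_000_000  # assumed bound for this sketch


def read_wheel_metadata(wheel) -> dict:
    """Read Name and Version from <name>.dist-info/METADATA inside a wheel."""
    with zipfile.ZipFile(wheel) as archive:
        for info in archive.infolist():
            path = PurePosixPath(info.filename)
            # Ignore members that try to escape the archive root.
            if path.is_absolute() or ".." in path.parts:
                continue
            is_metadata = (
                len(path.parts) == 2
                and path.parts[0].endswith(".dist-info")
                and path.name == "METADATA"
            )
            if not is_metadata:
                continue
            # Read at most one byte past the bound so oversized files are detected.
            with archive.open(info) as handle:
                data = handle.read(MAX_METADATA_BYTES + 1)
            if len(data) > MAX_METADATA_BYTES:
                raise ValueError("METADATA exceeds the size bound")
            text = data.decode("utf-8", errors="replace")
            message = Parser().parsestr(text, headersonly=True)
            return {"Name": message.get("Name"), "Version": message.get("Version")}
    raise FileNotFoundError("no .dist-info/METADATA member found")


def build_demo_wheel() -> io.BytesIO:
    """Build a tiny in-memory archive for demonstration purposes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("../evil.dist-info/METADATA", "Name: evil\nVersion: 9\n")
        archive.writestr(
            "example-1.0.0.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: example\nVersion: 1.0.0\n",
        )
    buffer.seek(0)
    return buffer
```

Nothing from the archive is written to disk or imported; the reader only parses header fields from one bounded member.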

Before installing a package, run a local precheck:

pkgwhy precheck typer --json
pkgwhy precheck "typer==0.12.5" --json
pkgwhy precheck -r requirements.txt --json
pkgwhy precheck pyproject.toml --json

By default, pkgwhy precheck uses local installed metadata when available and does not use the network. Add --pypi or --osv only when you want explicit online enrichment. Add --download-artifacts to query PyPI, download one wheel or source artifact, verify its SHA-256 when PyPI metadata provides it, extract it to a temporary review directory, inspect it statically, and delete the temporary files. The command does not install, import, or execute inspected package code.
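
The "verify its SHA-256 when PyPI metadata provides it" step can be sketched with the standard library. The function names and the three status strings below are assumptions for this example; the point is only that a missing expected digest is reported as its own state and is never treated as a successful verification.

```python
import hashlib
import hmac
import io
from typing import Optional


def sha256_hex(stream: io.BufferedIOBase, chunk_size: int = 1024 * 1024) -> str:
    """Hash a binary stream in chunks so large artifacts are not loaded at once."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(stream: io.BufferedIOBase, expected_hex: Optional[str]) -> str:
    """Return 'verified', 'mismatch', or 'unavailable' (hypothetical status names)."""
    if not expected_hex:
        return "unavailable"
    actual = sha256_hex(stream)
    matches = hmac.compare_digest(actual, expected_hex.strip().lower())
    return "verified" if matches else "mismatch"
```

A file opened with open(path, "rb") can be passed as the stream in the same way as the in-memory buffers used for testing.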

For CI or install-gate usage, add --enforce-exit-code:

pkgwhy precheck typer --json --enforce-exit-code

Exit codes are:

  • 0: allow.
  • 1: allow with caution or manual review.
  • 2: block or sandbox-only.
  • 4: requested online/artifact lookup was unavailable or failed.
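
A small lookup table is one way to picture how the decision values map onto the exit codes listed above. The mapping of decision names to codes follows that list; treating an unrecognized decision as a block is an assumption of this sketch, not documented pkgwhy behavior.

```python
# Decision names follow the decision values documented for pkgwhy JSON output.
PRECHECK_EXIT_CODES = {
    "allow": 0,
    "allow_with_caution": 1,
    "review_manually": 1,
    "sandbox_only": 2,
    "block": 2,
}
LOOKUP_UNAVAILABLE = 4


def exit_code_for(decision: str, lookup_failed: bool = False) -> int:
    """Map a decision to a gate exit code, failing closed on unknown values."""
    if lookup_failed:
        # A requested online or artifact lookup that failed takes priority.
        return LOOKUP_UNAVAILABLE
    return PRECHECK_EXIT_CODES.get(decision, 2)
```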

To gate an install, use pkgwhy pip install instead of raw pip install:

pkgwhy pip install typer
pkgwhy pip install typer --policy strict
pkgwhy pip install -r requirements.txt
pkgwhy pip install typer --dry-run --json

The pip gate always runs precheck first. It invokes pip only when the precheck policy allows the install, or when a human uses an explicit override flag. Review/caution results exit 1, block/sandbox-only results exit 2, tool/config errors exit 3, and unavailable or incomplete requested evidence exits 4. --dry-run evaluates the gate and writes the local decision log without invoking pip.

Overrides are explicit:

pkgwhy pip install typer --override-review --override-reason "reviewed local evidence"
pkgwhy pip install suspicious-name --override-block --override-reason "temporary isolated test"

Agents and automated workflows should use pkgwhy pip install when dependency policy is required, not raw pip install. The command is still decision support: it does not prove a package is safe and it does not sandbox pip or installed package code.

Local team policy and review history are filesystem features:

pkgwhy policy init
pkgwhy policy show --json
pkgwhy precheck typer --json --save-review
pkgwhy pip install typer --dry-run --json --save-review
pkgwhy audit --limit 5 --json --save-review
pkgwhy review list --json
pkgwhy report --markdown pkgwhy-report.md --json pkgwhy-report.json

The default policy path is .pkgwhy/policy.toml, and saved reviews default to .pkgwhy/reviews/. These files are local records, not cloud sync, hosted review, or tamper-proof audit logs.

For local CI without a hosted service:

pkgwhy ci --requirements requirements.txt --mode advisory --markdown pkgwhy-ci.md --json pkgwhy-ci.json
pkgwhy ci --requirements requirements.txt --mode strict
pkgwhy ci --requirements requirements.txt --mode agent

Advisory mode reports without failing on package review decisions. Strict mode fails on blocked or high-risk policy decisions. Agent mode is conservative for non-interactive agent workflows.

For CI package gates, start from the reusable GitHub Actions template:

examples/github-actions/pkgwhy-package-gate.yml

See CI Templates for advisory, strict, and agent-mode usage. The template does not require secrets or a hosted pkgwhy service.

For future Team/Cloud interest and product feedback:

pkgwhy beta
pkgwhy feedback

These commands only print links. They do not send telemetry, make network calls, create accounts, or enable hosted review.

Inspect the default agent policy and run a conservative agent precheck:

pkgwhy agent policy --json
pkgwhy agent precheck typer --json
pkgwhy agent judge typer --json
pkgwhy agent check typer --json
pkgwhy agent check requirements.txt --json
pkgwhy agent check ./my-tool-folder --json

pkgwhy agent precheck applies policy to the package judgement. In the default non-interactive mode, unknown and high-risk package decisions are blocked until a human reviews the judgement evidence. pkgwhy agent check dispatches the target to the safest matching local check and returns one normalized decision envelope. These commands do not install, import, or execute inspected package code. Agent precheck writes a compact local decision log under the user config directory when that directory is writable.

Run a conservative risk report:

pkgwhy risk typer

Audit a small slice of the current environment:

pkgwhy audit --limit 2 --json

Include controlled local OSV-like vulnerability data in an audit:

pkgwhy audit --limit 2 --json --vulnerability-file ./osv-response.json

Query OSV.dev explicitly during an audit:

pkgwhy audit --limit 2 --json --osv

Use a specific OSV cache directory:

pkgwhy audit --limit 2 --json --osv --osv-cache-dir ./.pkgwhy-osv-cache

Query PyPI JSON explicitly for provenance metadata during an audit:

pkgwhy audit --limit 2 --json --pypi

Vulnerability data can be incomplete or unavailable. pkgwhy reports source-attributed matches and fixed versions only when the supplied advisory data contains them. Cached OSV responses can be stale, and missing vulnerability matches are not evidence that a package is safe.
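
The idea of conservative matching against introduced and fixed events can be sketched as follows. This is a simplified illustration, not pkgwhy's matcher: it assumes OSV-style events that are already in ascending order, it only understands plain dotted numeric versions, and it returns "unknown" rather than guessing for anything else. A real implementation would normally rely on the packaging library for version parsing.

```python
from typing import List, Optional, Tuple


def parse_release(version: str) -> Optional[Tuple[int, ...]]:
    """Parse versions such as '1.2.3'; return None for anything more complex."""
    parts = version.split(".")
    if not all(part.isascii() and part.isdigit() for part in parts):
        return None
    numbers = [int(part) for part in parts]
    # Drop trailing zeros so that 1.0 and 1.0.0 compare as equal.
    while numbers and numbers[-1] == 0:
        numbers.pop()
    return tuple(numbers)


def range_status(installed: str, events: List[dict]) -> str:
    """Return 'affected', 'not_affected', or 'unknown' for one event range."""
    current = parse_release(installed)
    if current is None:
        return "unknown"
    affected = False
    for event in events:
        if "introduced" in event:
            start = parse_release(event["introduced"])
            if start is None:
                return "unknown"
            if current >= start:
                affected = True
        elif "fixed" in event:
            fixed = parse_release(event["fixed"])
            if fixed is None:
                return "unknown"
            if current >= fixed:
                affected = False
        else:
            # Event types this sketch does not understand are not guessed at.
            return "unknown"
    return "affected" if affected else "not_affected"
```

Note that "not_affected" here only describes one advisory range; as stated above, it is not evidence that the package is safe.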

Check package names for typosquatting similarity signals:

pkgwhy typos reqeusts pandas-stubs
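
For intuition about what a similarity signal is, the sketch below compares a candidate name against a short list of known names using difflib from the standard library. The algorithm, the 0.85 threshold, and the known-name list are all assumptions chosen for this example; the README does not specify how pkgwhy computes similarity.

```python
from difflib import SequenceMatcher
from typing import List, Sequence, Tuple

KNOWN_NAMES = ["requests", "pandas", "pandas-stubs", "numpy"]  # illustrative only


def similar_names(
    candidate: str, known_names: Sequence[str], threshold: float = 0.85
) -> List[Tuple[str, float]]:
    """Return known names that closely resemble the candidate, best match first."""
    normalized = candidate.lower().replace("_", "-")
    hits = []
    for known in known_names:
        if normalized == known:
            # An exact match to a known name is not a typo signal.
            return []
        score = SequenceMatcher(None, normalized, known).ratio()
        if score >= threshold:
            hits.append((known, round(score, 3)))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

With this list, "reqeusts" is flagged as close to "requests", while "pandas-stubs" matches a known name exactly and produces no signal. A resemblance is only a prompt for review, not proof of typosquatting.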

Create and select a local registry:

pkgwhy registry init ~/.pkgwhy/registry
pkgwhy registry list
pkgwhy registry use local

Add an existing local registry directory:

pkgwhy registry add work-tools ~/.pkgwhy/work-tools-registry

Publish and inspect a local Python script:

pkgwhy publish ./my_tool.py
pkgwhy tool inspect local/my_tool
pkgwhy tool judge local/my_tool --json

Publish a folder with an explicit pkgwhy.toml manifest:

[tool]
name = "my-tool"
owner = "local"
version = "0.1.0"
description = "Local Python tool."
artifact_type = "folder"
entrypoint = "main.py"
python_requires = ">=3.11"
dependencies = []
declared_permissions = ["filesystem"]

[security]
requires_human_approval = true
allow_unsigned = false
allow_unpinned_dependencies = false
signing_status = "not_implemented"

[agent]
default_decision = "review_manually"
non_interactive_decision = "review_manually"

pkgwhy publish ./my-tool-folder

Run a local private tool after hash verification and policy checks:

pkgwhy run local/my_tool

Apply the stricter non-interactive runner policy:

pkgwhy run local/my_tool --non-interactive

Agent Integration Contract

Compatibility policy: docs/json-schema-compatibility.md.

Decision-oriented JSON commands expose a shared top-level contract where the field applies: schema_version, command, target, target_type, decision, risk_level, confidence, recommended_next_action, exit_code, exit_code_meaning, warnings, evidence, evidence_summary, source_freshness, policy, and errors for JSON error objects. Existing command-specific fields remain available.

Exit code meanings are stable for agent consumers:

  • 0: allowed or completed successfully.
  • 1: review or caution required before proceeding.
  • 2: blocked by policy or risk decision.
  • 3: tool, configuration, or user input error.
  • 4: external data unavailable or evidence incomplete.

Decision meanings:

  • allow: proceed under normal review practices.
  • allow_with_caution: proceed only after reviewing warnings and evidence.
  • review_manually: ask a human to review before proceeding.
  • sandbox_only: use only inside a real sandbox; a Python virtual environment is not a full OS sandbox.
  • block: do not install, import, or run unless a human approves a policy exception.

When --json is set and a handled user/configuration error occurs, commands emit pkgwhy.error.v1 with error_type, message, exit_code, exit_code_meaning, suggested_fix, command, target, and target_type where available.

Batch precheck JSON includes top-level blocking_targets, review_targets, allowed_targets, and aggregate_recommendation so automation can see the package-level reason without walking every nested result. Full evidence remains available, and evidence_summary gives compact counts, top evidence, top warnings, and top rule IDs for agents that need a smaller decision surface.
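
An agent consuming this contract might branch on the schema version, exit code, and decision roughly as sketched below. The field names come from the contract described above; the returned action strings and the fail-closed default are hypothetical choices for this example.

```python
import json


def next_step(raw_json: str) -> str:
    """Turn one pkgwhy JSON document into a coarse action for an agent."""
    payload = json.loads(raw_json)
    if payload.get("schema_version") == "pkgwhy.error.v1":
        # Handled user or configuration errors carry a suggested fix.
        return "fix_input"
    exit_code = payload.get("exit_code")
    decision = payload.get("decision")
    if exit_code == 4:
        return "evidence_incomplete"
    if exit_code == 0 and decision == "allow":
        return "proceed"
    if exit_code == 1 and decision in ("allow_with_caution", "review_manually"):
        return "ask_human"
    # Anything else, including unexpected combinations, fails closed.
    return "stop"
```

Checking both exit_code and decision means a document with inconsistent or missing fields falls through to "stop" rather than being treated as permission.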

The pip gate safety model is unchanged: pkgwhy pip install runs precheck before pip and does not invoke pip in dry-run mode, for blocked decisions, or when required evidence is unavailable. It does not sandbox pip or installed package code.

Local registry trust state now affects tool judgement. blocked and quarantined tools return blocking tool judge decisions. trusted and reviewed are local human labels; they do not override hash mismatch, signature status, manifest approval requirements, or static capability warnings.
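
The precedence described above, where a trusted or reviewed label never overrides a failed hash check or a blocking judgement, can be sketched as a function that collects every blocking reason. The function name, parameter names, and reason strings are assumptions for illustration; "verified" is the hash_status value shown in the tool judgement example in this document.

```python
from typing import List, Tuple

BLOCKING_TRUST_STATES = {"blocked", "quarantined"}
BLOCKING_DECISIONS = {"sandbox_only", "block"}


def pre_run_gate(
    trust_state: str, hash_status: str, judgement_decision: str
) -> Tuple[bool, List[str]]:
    """Return (allowed, reasons); any single blocking reason prevents the run."""
    reasons = []
    if trust_state in BLOCKING_TRUST_STATES:
        reasons.append("registry trust state is " + trust_state)
    if hash_status != "verified":
        reasons.append("bundle hash status is " + hash_status)
    if judgement_decision in BLOCKING_DECISIONS:
        reasons.append("tool judgement is " + judgement_decision)
    return (not reasons, reasons)
```

Because reasons accumulate instead of short-circuiting, a log entry built from the result can show every condition that blocked the run.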

Package judgement schema version: pkgwhy.package_judgement.v1.

Field shape for pkgwhy judge <package> --json:

{
  "schema_version": "pkgwhy.package_judgement.v1",
  "risk_model_version": "pkgwhy.risk_model.v1",
  "package": "package-name",
  "version": "installed-version-or-null",
  "decision": "allow_with_caution",
  "risk_level": "medium",
  "confidence": "medium",
  "summary": "summary from installed metadata or local explanation sources",
  "source_availability": "installed_source_present",
  "installed_size_bytes": 0,
  "detected_capabilities": [],
  "warnings": [],
  "recommendation": "conservative recommendation text",
  "evidence": [],
  "risk_rules": [],
  "known_vulnerabilities": [],
  "provenance": {
    "package": "package-name",
    "version": "installed-version-or-null",
    "repository_url": null,
    "documentation_url": null,
    "homepage_url": null,
    "project_urls": {},
    "metadata_source": "installed_distribution_metadata",
    "source_distribution_status": "unknown",
    "trusted_publishing_status": "unknown",
    "attestation_status": "not_implemented",
    "release_activity_status": "unknown",
    "confidence": "low",
    "warnings": [],
    "evidence": []
  },
  "capability_exposure_note": "Python packages run with the same permissions as the Python process. This analysis detects capabilities used or referenced by package code and metadata; static signals are not proof of runtime behavior or intent."
}

Values are environment-specific. Run pkgwhy judge <installed-package> --json locally for actual installed-package evidence.

Agent policy schema version: pkgwhy.agent_policy.v1.

Field shape for pkgwhy agent policy --json:

{
  "schema_version": "pkgwhy.agent_policy.v1",
  "allow_public_pypi": false,
  "allow_unpinned_dependencies": false,
  "allow_unsigned_tools": false,
  "require_pkgwhy_judgement": true,
  "require_hash_verification": true,
  "require_signature_verification": false,
  "non_interactive_default_decision": "block",
  "unknown_package_decision": "review_manually",
  "high_risk_package_decision": "review_manually",
  "critical_risk_package_decision": "block",
  "non_interactive_unknown_package_decision": "block",
  "non_interactive_high_risk_package_decision": "block",
  "non_interactive_critical_risk_package_decision": "block",
  "tool_execution_requires_local_registry": true,
  "dynamic_analysis_default_decision": "block"
}
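
The policy keys shown above follow a regular naming pattern, so selecting the applicable decision can be sketched as a key lookup. The function below is a hypothetical illustration, not pkgwhy's policy engine; in particular, returning None for low and medium risk (leaving the package judgement's own decision in place) is an assumption, because the policy shape above defines no keys for those levels.

```python
from typing import Optional

# Subset of the default policy fields shown above.
DEFAULT_POLICY = {
    "unknown_package_decision": "review_manually",
    "high_risk_package_decision": "review_manually",
    "critical_risk_package_decision": "block",
    "non_interactive_unknown_package_decision": "block",
    "non_interactive_high_risk_package_decision": "block",
    "non_interactive_critical_risk_package_decision": "block",
}

RISK_KEY_PARTS = {"unknown": "unknown", "high": "high_risk", "critical": "critical_risk"}


def policy_decision(policy: dict, risk_level: str, non_interactive: bool) -> Optional[str]:
    """Return the policy decision for a risk level, or None if no key applies."""
    part = RISK_KEY_PARTS.get(risk_level)
    if part is None:
        return None
    prefix = "non_interactive_" if non_interactive else ""
    return policy[prefix + part + "_package_decision"]
```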

Agent package precheck schema version: pkgwhy.agent_package_precheck.v1.

Field shape for pkgwhy agent precheck <package> --json:

{
  "schema_version": "pkgwhy.agent_package_precheck.v1",
  "policy_schema_version": "pkgwhy.agent_policy.v1",
  "package": "package-name",
  "version": "installed-version-or-null",
  "target_type": "package",
  "non_interactive": true,
  "decision": "block",
  "risk_level": "unknown",
  "confidence": "low",
  "policy_decision_source": "agent_policy",
  "reasons": [],
  "warnings": [],
  "recommendation": "conservative recommendation text",
  "package_judgement": {
    "schema_version": "pkgwhy.package_judgement.v1"
  }
}

The embedded package_judgement contains the same package judgement shape as pkgwhy judge --json. Compact local agent decision logs use pkgwhy.agent_decision_log.v1 and intentionally omit the full judgement evidence.

Tool judgement schema version: pkgwhy.tool_judgement.v1.

Field shape for pkgwhy tool judge <tool> --json:

{
  "schema_version": "pkgwhy.tool_judgement.v1",
  "tool": "local/my_tool",
  "owner": "local",
  "name": "my_tool",
  "version": "0.1.0",
  "decision": "review_manually",
  "risk_level": "medium",
  "confidence": "medium",
  "reason": "Tool bundle hash matches the local registry index.",
  "requires_human_approval": true,
  "manifest": {
    "schema_version": "pkgwhy.tool_manifest.v1",
    "name": "my_tool",
    "owner": "local",
    "version": "0.1.0",
    "description": "Local Python script published with pkgwhy.",
    "artifact_type": "script",
    "entrypoint": "my_tool.py",
    "python_requires": ">=3.11",
    "dependencies": [],
    "declared_permissions": [],
    "security": {
      "requires_human_approval": true,
      "allow_unsigned": false,
      "allow_unpinned_dependencies": false,
      "signing_status": "not_implemented"
    },
    "agent": {
      "default_decision": "review_manually",
      "non_interactive_decision": "review_manually"
    }
  },
  "declared_permissions": [],
  "detected_capabilities": [],
  "hash_status": "verified",
  "signature_status": "not_implemented",
  "warnings": [
    "Signature verification is not implemented yet."
  ],
  "recommendation": "Review declared permissions and manifest metadata before running this private tool."
}

Tool validation schema version: pkgwhy.tool_validation.v1.

Field shape for pkgwhy tool validate ./folder --json:

{
  "schema_version": "pkgwhy.tool_validation.v1",
  "command": "pkgwhy tool validate",
  "target": "./folder",
  "target_type": "tool_folder",
  "valid": true,
  "decision": "allow",
  "risk_level": "low",
  "confidence": "high",
  "entrypoint": "main.py",
  "declared_permissions": [],
  "detected_capabilities": [],
  "issues": [],
  "errors": [],
  "warnings": [],
  "policy": {
    "executes_tool_code": false,
    "writes_to_registry": false,
    "symlinks_supported": false
  }
}

Tool validation reads local files and manifests, checks path boundaries and entrypoints, and statically analyzes files. It does not publish to a registry and does not execute tool code.

Agent check schema version: pkgwhy.agent_check.v1.

Field shape for pkgwhy agent check <target> --json:

{
  "schema_version": "pkgwhy.agent_check.v1",
  "command": "pkgwhy agent check",
  "target": "requirements.txt",
  "target_type": "requirements",
  "decision": "block",
  "risk_level": "high",
  "confidence": "medium",
  "recommended_next_action": "conservative next action text",
  "exit_code": 2,
  "exit_code_meaning": "blocked by policy or risk decision",
  "evidence_summary": {},
  "result_schema_version": "pkgwhy.precheck_batch.v1",
  "result": {
    "schema_version": "pkgwhy.precheck_batch.v1"
  }
}

pkgwhy agent check currently dispatches package specs to package precheck, requirements files and pyproject-style TOML files to batch precheck, and local tool folders/scripts to tool validation.

Supported package and tool decision values are:

  • allow
  • allow_with_caution
  • review_manually
  • sandbox_only
  • block

Supported risk levels are:

  • low
  • medium
  • high
  • critical
  • unknown

Security Model

pkgwhy is static and metadata-first. Package inspection reads metadata, files, text, and AST without importing or executing inspected packages.

Python packages do not have browser-style or mobile-style permissions. They usually run with the same operating-system permissions as the Python process and the user executing them. pkgwhy therefore reports capability exposure signals, not guaranteed permissions.

Examples of static signals:

  • A source file references subprocess.run.
  • A package declares console scripts.
  • Installed files include native compiled extensions.
  • AST parsing finds eval, exec, pickle.loads, or environment-variable access.
  • Source text contains URL/domain references.
  • Source text contains credential-like assignment names, with suspicious values masked.
  • JavaScript files appear minified, reference eval, reference atob, include source maps, or contain obfuscation-like patterns.

These signals can be legitimate. They are review prompts, not proof of malicious behavior.
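
A minimal sketch of how AST-only scanning can produce file and line evidence without running the code is shown below. The watched names are a small illustrative subset, and the function name and output shape are assumptions; this is not pkgwhy's rule engine.

```python
import ast
from typing import List, Tuple

DYNAMIC_CALLS = {"eval", "exec"}
WATCHED_ATTRIBUTES = {("subprocess", "run"), ("pickle", "loads"), ("os", "environ")}


def scan_source(source: str, filename: str = "<source>") -> List[Tuple[int, str]]:
    """Return sorted (line, signal) pairs found by parsing, never executing, source."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        # Direct calls such as eval(...) or exec(...).
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in DYNAMIC_CALLS
        ):
            findings.append((node.lineno, node.func.id))
        # Attribute references such as subprocess.run or os.environ.
        elif isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
            if (node.value.id, node.attr) in WATCHED_ATTRIBUTES:
                findings.append((node.lineno, node.value.id + "." + node.attr))
    return sorted(findings)


SAMPLE = (
    "import os, subprocess\n"
    "value = eval('1 + 1')\n"
    "subprocess.run(['ls'])\n"
    "home = os.environ.get('HOME')\n"
)
```

Calling scan_source(SAMPLE) reports eval on line 2, subprocess.run on line 3, and os.environ on line 4. A name-based match like this misses aliased imports and flags harmless uses alike, which is why such findings are treated as review prompts.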

Dynamic Sandbox Roadmap

Dynamic analysis has an experimental command skeleton, but it is not part of the stable security decision surface in this release. Dynamic analysis intentionally executes code, so it has a different safety boundary from static inspection.

The current release includes no arbitrary package dynamic analysis, no container backend, and no production sandboxing claim. The command exists only as a safe-fail surface that demonstrates the intended JSON shape and safety boundary.

The design is documented in docs/dynamic-sandbox.md. The key constraints are:

  • static package inspection remains the default;
  • unknown package code is not run on the host;
  • no arbitrary dynamic package installation, import, or CLI execution;
  • a future dynamic backend should use a disposable sandbox boundary;
  • network should be off by default;
  • filesystem access should default to a temporary scratch directory;
  • host secrets should not be inherited;
  • missing sandbox backends should fail safely instead of falling back to host execution;
  • empty event lists are not proof of safety.

The current command surface is a safe-fail skeleton:

pkgwhy dynamic --help
pkgwhy dynamic inspect --help
pkgwhy dynamic inspect demo-target --container --network off

Until a sandbox backend exists, pkgwhy dynamic inspect refuses to execute the target and reports that the backend is unavailable or blocked.

pkgwhy run is still a separate local private-tool execution path and is not dynamic package analysis.

Dynamic analysis result schema version: pkgwhy.dynamic_analysis.v1.

Field shape for pkgwhy dynamic inspect <target> --container --json while no backend is available:

{
  "schema_version": "pkgwhy.dynamic_analysis.v1",
  "target": "target-name",
  "mode": "inspect",
  "sandbox_backend": "container",
  "network_mode": "off",
  "filesystem_mode": "scratch",
  "status": "backend_unavailable",
  "warnings": [],
  "process_events": [],
  "filesystem_events": [],
  "network_events": [],
  "decision": "block",
  "limitations": []
}

Event lists are populated only when a future backend actually observes events. Empty event lists are not proof of safety.

Risk And Agent Decisions

Risk levels:

  • low
  • medium
  • high
  • critical
  • unknown

Agent decisions:

  • allow
  • allow_with_caution
  • review_manually
  • sandbox_only
  • block

The current risk engine is deliberately conservative and still at an early stage. Treat it as decision support for humans and agents, not a final verdict.

Risk rule output includes risk_model_version and per-rule rule_id, category, severity, confidence, message, evidence, optional file path, optional line number, optional symbol, and false-positive notes. These rule IDs are a 1.0.0 compatibility surface; incompatible changes require changelog coverage and may require a schema or catalog version bump.

Detailed rule categories, corpus strategy, compatibility expectations, and false-positive/false-negative limitations are documented in docs/static-rule-corpus.md.

Current 1.0.0 rule IDs:

  • PKGWHY-VULN-001: known vulnerability advisory match.
  • PKGWHY-RISK-001: possible typosquatting similarity.
  • PKGWHY-RISK-002: unknown source availability from installed files.
  • PKGWHY-RISK-003: missing license metadata.
  • PKGWHY-RISK-004: native compiled code present.
  • PKGWHY-RISK-005: static capability signal.
  • PKGWHY-RISK-006: no installed files found through distribution metadata.
  • PKGWHY-PY-001: Python dynamic code execution reference.
  • PKGWHY-PY-002: Python dynamic import reference.
  • PKGWHY-PY-003: Python deserialisation-risk reference.
  • PKGWHY-PY-004: Python encoded-payload handling reference.
  • PKGWHY-PY-005: Python subprocess or shell execution reference.
  • PKGWHY-PY-006: Python environment or secret-like access reference.
  • PKGWHY-PY-007: Python package-manager manipulation reference.
  • PKGWHY-PY-008: Python unsafe YAML load reference.
  • PKGWHY-PY-009: Python obfuscation-bootstrap signal.
  • PKGWHY-BUILD-001: executable setup.py present.
  • PKGWHY-BUILD-002: setup-time subprocess or shell reference.
  • PKGWHY-BUILD-003: setup-time network reference.
  • PKGWHY-BUILD-004: setup-time dynamic execution reference.
  • PKGWHY-BUILD-005: build backend declared.
  • PKGWHY-BUILD-006: setup.cfg present.
  • PKGWHY-NET-001: source URL or domain reference.
  • PKGWHY-CRED-001: credential-like assignment with value masked.
  • PKGWHY-JS-001: JavaScript minification or density signal.
  • PKGWHY-JS-002: JavaScript dynamic execution reference.
  • PKGWHY-JS-003: JavaScript encoded-payload signal.
  • PKGWHY-JS-004: JavaScript obfuscation-like signal.
  • PKGWHY-JS-005: JavaScript source-map reference.
  • PKGWHY-BIN-001: native extension or library present.
  • PKGWHY-BIN-002: WASM binary present.
  • PKGWHY-BIN-003: native executable present.
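The IDs above share a visible `PKGWHY-<FAMILY>-<NNN>` shape, so a consumer could group findings by family when summarising a report. The sketch below assumes that shape holds for every ID; the pattern and helper names are hypothetical and not part of pkgwhy.

```python
import re
from collections import defaultdict

# Pattern inferred from the listed IDs; it is an assumption, not a spec.
RULE_ID_PATTERN = re.compile(r"^PKGWHY-(?P<family>[A-Z]+)-(?P<number>\d{3})$")


def split_rule_id(rule_id: str) -> tuple:
    """Split 'PKGWHY-PY-005' into ('PY', 5); raise on unexpected shapes."""
    match = RULE_ID_PATTERN.match(rule_id)
    if match is None:
        raise ValueError(f"unrecognised rule ID: {rule_id!r}")
    return match.group("family"), int(match.group("number"))


def group_by_family(rule_ids) -> dict:
    """Group rule IDs by their family segment, preserving input order."""
    groups = defaultdict(list)
    for rule_id in rule_ids:
        family, _ = split_rule_id(rule_id)
        groups[family].append(rule_id)
    return dict(groups)


print(group_by_family(["PKGWHY-PY-001", "PKGWHY-BUILD-003", "PKGWHY-PY-008"]))
```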

Known-vulnerability output is source-attributed. A missing vulnerability match does not prove that a package has no vulnerabilities, because advisory databases and local fixtures can be incomplete or unavailable.

Native extensions, WASM files, minified JavaScript, URL references, and credential-like names are not automatically malicious. They are evidence for review, and the surrounding package purpose and source context still matter.

Private Registry And Agent Policy

pkgwhy is intended to grow into a private, security-aware executable layer for Python tools and AI-agent skills. The current MVP uses a local registry:

pkgwhy registry init ~/.pkgwhy/registry
pkgwhy publish ./my_tool.py
pkgwhy tool inspect local/my_tool
pkgwhy tool judge local/my_tool --json
pkgwhy run local/my_tool

The runner executes only tools resolved from the configured local registry. It does not run arbitrary public package code, does not install tool dependencies in the MVP, and blocks execution if the stored bundle hash does not verify. Local registry entries are file-backed records under the configured registry path; no cloud registry, account, upload, pull, or remote sync is implemented in this release.
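The hash check described above can be illustrated with a short Python sketch. It assumes a SHA-256 hex digest over a single bundle file; the hash algorithm pkgwhy actually uses, its bundle layout, and the function names below are not stated in this README and are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def bundle_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(path: Path, stored_hash: str) -> bool:
    """True only if the file's digest matches the stored hex digest."""
    return hmac.compare_digest(bundle_digest(path), stored_hash.strip().lower())


def run_if_verified(path: Path, stored_hash: str) -> None:
    """Refuse to continue when verification fails (execution is omitted)."""
    if not verify_bundle(path, stored_hash):
        raise RuntimeError(f"bundle hash mismatch for {path}; refusing to run")
    # A real runner would execute the tool here; this sketch stops short.
```

Failing closed, by raising before any execution step, mirrors the behaviour the paragraph describes: a bundle that does not verify is never run.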

The current release includes policy-as-code foundations for agents:

  • pkgwhy agent policy shows the conservative default policy.
  • pkgwhy agent precheck <package> --json applies policy to package judgement JSON.
  • pkgwhy agent judge <package> --json is currently a package precheck alias.
  • Non-interactive package prechecks block unknown and high-risk package decisions by default.
  • Agent decision logs are local and compact, are written on a best-effort basis when the config directory is writable, and do not include full static evidence.
  • Local team policy files, review history, reports, and CI gates are filesystem-backed and do not require a hosted service.
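The default for non-interactive prechecks listed above (block unknown and high-risk decisions) could be applied by an agent wrapper roughly as follows. This is a sketch under stated assumptions: the `risk_level` key, the `"low"` value, and the `"allow"`/`"block"` strings are hypothetical and may not match the real judgement JSON.

```python
# Risk levels blocked by default in non-interactive use, per the list above.
BLOCKED_RISK_LEVELS = frozenset({"unknown", "high"})


def non_interactive_decision(judgement: dict) -> str:
    """Return "block" or "allow" for a judgement dict.

    A missing or empty risk level is treated as "unknown", so it is blocked.
    The "risk_level" key is an assumed name, not the documented schema.
    """
    raw = judgement.get("risk_level")
    risk = str(raw).strip().lower() if raw else "unknown"
    return "block" if risk in BLOCKED_RISK_LEVELS else "allow"


for example in ({"risk_level": "high"}, {}, {"risk_level": "low"}):
    print(example, "->", non_interactive_decision(example))
```

Treating absent data as unknown keeps the wrapper conservative: a judgement that cannot be read is handled the same way as one that reports no confidence.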

The MVP runner uses Python virtual environments for dependency isolation. A virtual environment is not a full operating-system sandbox, and pkgwhy states that clearly before each run:

This run uses a Python virtual environment for dependency isolation. It does not fully sandbox operating-system permissions.

Signing is also not implemented yet, so the JSON judgement reports signature_status: "not_implemented" rather than pretending a signature was verified.
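A consumer of the judgement JSON may want to surface that status explicitly instead of silently ignoring it. The sketch below relies only on the `signature_status` field and the `"not_implemented"` value mentioned above; the function name and the message wording are hypothetical.

```python
def describe_signature(judgement: dict) -> str:
    """Turn the signature_status field into a plain-language note."""
    status = judgement.get("signature_status")
    if status == "not_implemented":
        return "No signature was checked: signing is not implemented."
    if status is None:
        return "No signature status was reported."
    # Any other value is passed through without interpretation.
    return f"Signature status reported as: {status}"


print(describe_signature({"signature_status": "not_implemented"}))
```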

Future Cloud Review

The local/free product will remain offline-first: metadata inspection, AST scanning, capability signals, local risk rules, and local JSON judgement.

A future Team/Cloud layer may add shared policy, hosted review history, GitHub PR summaries, audit logs, an agent install gateway, private package/tool review workflows, privacy controls, and enterprise/private deployment options. Billing and hosted cloud review are not implemented in this release.

Development

python -m venv .venv
.venv/bin/python -m pip install -e ".[dev]"
.venv/bin/python -m pytest
.venv/bin/python -m build
.venv/bin/python -m twine check dist/*

Roadmap

  1. Expand agent policy validation, tool-specific agent judgement, and decision explanation.
  2. Broader optional PyPI/source lookup and cache.
  3. Tool dependency installation in the runner.
  4. Tool bundle signing and signature verification.
  5. Cloud/private remote registry backends.
  6. Cloud/model-backed review as an optional future service.

License

MIT License. See LICENSE.

Repository: https://github.com/devlukeg/pkgwhy

Issues: https://github.com/devlukeg/pkgwhy/issues

Download files

Source Distribution

pkgwhy-1.8.0.tar.gz (168.9 kB)

Uploaded Source

Built Distribution

pkgwhy-1.8.0-py3-none-any.whl (125.8 kB)

Uploaded Python 3

File details

Details for the file pkgwhy-1.8.0.tar.gz.

File metadata

  • Download URL: pkgwhy-1.8.0.tar.gz
  • Size: 168.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for pkgwhy-1.8.0.tar.gz
Algorithm Hash digest
SHA256 58df89cc3285c982b61f67e7879cae83cdb1294bd9dc757b2086a81baab1678e
MD5 f98f67155e949cb71efecda0cda2531d
BLAKE2b-256 eb67c433447741d1edb92208bacc6e012edd367c0682c37c097e309f3e47dac7

File details

Details for the file pkgwhy-1.8.0-py3-none-any.whl.

File metadata

  • Download URL: pkgwhy-1.8.0-py3-none-any.whl
  • Size: 125.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for pkgwhy-1.8.0-py3-none-any.whl
Algorithm Hash digest
SHA256 601c9c5aeb10be22ce03044547b34ae6b3623c5c1b25ac65de5743bd28a5f317
MD5 455d4f52e7a253a27870403f6fabad7e
BLAKE2b-256 3bef6e0527b3b7aa78e86898cf0bea892a1f2483ab759cfbb5bfb4de44b454bd
