
Vet the packages and repos your AI assistant recommended — before you install. Catches hallucinated/slopsquatted names, malware signals, license traps, dead repos, and fake-star inflation.


pkgguard

Vet the packages and repos your AI assistant recommended — before you install them.



Your AI coding assistant just gave you a pip install line or a list of "recommended libraries." Some of those names don't exist. Some are one keystroke away from a popular package. Some are real but AGPL-licensed, abandoned, riddled with CVEs, or propped up by bought stars. Installing them is how modern supply-chain attacks start.

pkgguard checks every name in one command and tells you which are safe — across PyPI, npm, crates.io, Go, RubyGems, Packagist, NuGet and pub.dev — with no API key and without executing a single line of the packages it inspects.

$ pkgguard requests reqeusts beautifulsoup-4 django==3.0.0 super-fake-pkg-zzz

      ✅ OK      requests           Exists on PyPI · 1.6B downloads/mo · Apache-2.0
      ⚠️ WARN    django             31 known CVEs in 3.0.0 — upgrade
      ❌ DANGER  reqeusts           Resembles 'requests' but does not exist — slopsquat bait
      ❌ DANGER  beautifulsoup-4    Hallucination — real package is 'beautifulsoup4'
      ❌ DANGER  super-fake-pkg-zzz Not found on PyPI or npm — likely invented

1 ok  1 warn  3 danger  0 unknown          # exit code 2 → fails your CI

Existing tools (Snyk, Socket, OSV) scan dependencies you've already chosen. pkgguard answers the newer, earlier question: "the thing the LLM just told me to install — is it even real, and should I trust it?"
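As a rough illustration of the "is it even real" question, the sketch below asks the public PyPI and npm metadata endpoints whether a name resolves. It is not pkgguard's implementation: `exists_on`, `http_status`, and the injectable `fetch_status` parameter are hypothetical names chosen for this example.

```python
import urllib.error
import urllib.parse
import urllib.request

# Public registry metadata endpoints; a 404 means the name is not registered.
REGISTRY_URLS = {
    "pypi": "https://pypi.org/pypi/{name}/json",
    "npm": "https://registry.npmjs.org/{name}",
}


def http_status(url, timeout=10.0):
    """Return the HTTP status of a GET request, including 4xx/5xx statuses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def exists_on(name, fetch_status=http_status):
    """Report, per registry, whether `name` resolves to a published package."""
    quoted = urllib.parse.quote(name, safe="@")  # "@scope/pkg" -> "@scope%2Fpkg"
    return {
        registry: fetch_status(template.format(name=quoted)) == 200
        for registry, template in REGISTRY_URLS.items()
    }
```

Network failures (`urllib.error.URLError`) are deliberately left unhandled here; a real tool would presumably report those as UNKNOWN rather than as "does not exist".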


🩸 Why this matters: the slopsquatting epidemic

LLMs invent package names. A peer-reviewed USENIX Security 2025 study generated 576,000 code samples that referenced 2.23M packages in total and found:

  • 19.7% of the referenced packages did not exist (up to ~33% for some models).
  • When the same prompt was repeated, 43% of hallucinated names appeared every single time, so they are predictable.

That predictability is the attack. An adversary asks an LLM what it hallucinates, pre-registers those exact names on PyPI/npm with malware inside, and waits for the next developer to copy-paste the assistant's answer into a terminal. Security researchers named it slopsquatting, and it is already happening in the wild.

As AI assistants become the default way developers discover dependencies, the moment an AI hands you a package list is now a front-line security boundary. pkgguard guards exactly that moment.


⚡ What it checks

Every package/repo runs through an ordered pipeline of 10 checks. The worst finding sets the verdict, so a single command gives you one clear answer per item.

| Check | What it catches |
| --- | --- |
| Existence | Names that don't exist on any registry: the core hallucination / slopsquat signal |
| Typosquat + homoglyph | 1–2 edits from a popular package (reqeusts → requests) and digit/letter look-alikes (dj4ng0 → django) |
| Known vulnerabilities | Open CVEs/advisories for the resolved version via OSV.dev (GHSA / PyPA / RustSec / RubySec…) |
| Source malware scan (opt-in --scan) | Statically inspects the package archive for install-time code execution, obfuscated payloads, child_process/os.system, credential access, without ever running it |
| License traps | AGPL / SSPL / BUSL / CC-BY-NC / fair-code / "no license": landmines for commercial products |
| Maintenance | Archived, disabled, deprecated, or long-abandoned projects |
| Popularity | Download counts as a legitimacy signal |
| Fake stars (opt-in --deep) | Star-count inflation: implausible growth and burst-buying patterns |
| Repo health | GitHub stars / last commit / license / archived state for the upstream repo |
| Malware metadata | npm install scripts, freshly-registered look-alikes, packages with no auditable source |
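To make the typosquat + homoglyph row concrete, here is a minimal sketch of one way such a check could work: fold digit look-alikes into letters, then measure edit distance (counting an adjacent swap as one edit) against a reference list. The `POPULAR` set, the `LOOKALIKES` table, and the function names are assumptions for illustration, not pkgguard's actual data or API.

```python
import re

# Hypothetical, tiny reference list; a real check would use a larger curated set.
POPULAR = {"requests", "django", "numpy", "pandas", "beautifulsoup4"}

# Digit-to-letter look-alikes folded before comparison (illustrative subset).
LOOKALIKES = str.maketrans({"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t"})


def normalize(name):
    """PEP 503 style normalization: lowercase, runs of -, _ and . become one dash."""
    return re.sub(r"[-_.]+", "-", name).lower()


def edit_distance(a, b):
    """Optimal string alignment distance: insert, delete, substitute, adjacent swap."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def nearest_popular(name, max_edits=2):
    """Return the popular package `name` resembles, or None if it resembles none."""
    candidate = normalize(name)
    if candidate in POPULAR:
        return None  # exact match for a known package, not a look-alike
    folded = candidate.translate(LOOKALIKES)
    if folded in POPULAR:
        return folded  # homoglyph hit, e.g. dj4ng0 -> django
    distance, match = min((edit_distance(candidate, p), p) for p in sorted(POPULAR))
    return match if distance <= max_edits else None
```

With this sketch, `nearest_popular("reqeusts")` points at `requests` and `nearest_popular("dj4ng0")` at `django`. A fixed threshold of two edits is noisy for very short names, so a production check would likely scale it with name length.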

Supported ecosystems: PyPI · npm · crates.io · Go modules · RubyGems · Packagist · NuGet · pub.dev — plus GitHub repos.
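For the known-vulnerabilities check, a query against OSV.dev could look roughly like the sketch below, which posts a package name, ecosystem, and version to the public `/v1/query` endpoint. The short ecosystem keys and the function names are hypothetical, and the ecosystem identifiers should be verified against the OSV schema before relying on them.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"

# Assumed mapping from short keys (hypothetical) to OSV ecosystem identifiers.
OSV_ECOSYSTEMS = {
    "pypi": "PyPI",
    "npm": "npm",
    "crates": "crates.io",
    "go": "Go",
    "rubygems": "RubyGems",
    "packagist": "Packagist",
    "nuget": "NuGet",
    "pub": "Pub",
}


def build_osv_query(name, ecosystem, version):
    """Build the JSON body for a single-package OSV query."""
    return {
        "version": version,
        "package": {"name": name, "ecosystem": OSV_ECOSYSTEMS[ecosystem]},
    }


def known_vulnerability_ids(name, ecosystem, version, timeout=10.0):
    """Return advisory IDs OSV reports for this exact version (network call)."""
    body = json.dumps(build_osv_query(name, ecosystem, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # OSV returns an empty object when nothing matches.
    return [vuln["id"] for vuln in payload.get("vulns", [])]
```

Calling `known_vulnerability_ids("django", "pypi", "3.0.0")` needs network access; nothing from the package itself is downloaded or executed.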


📦 Install

pip install pkgguard-cli            # core: zero dependencies, no API key
pip install "pkgguard-cli[rich]"    # + prettier coloured tables
pip install "pkgguard-cli[mcp]"     # + MCP server for AI assistants

Requires Python 3.9+. The core engine is stdlib-only.

🚀 Quick start

# a few names
pkgguard requests numpy pandas

# a manifest — auto-detected: requirements.txt, package.json, Cargo.toml, Gemfile, go.mod
pkgguard requirements.txt
pkgguard package.json Cargo.toml          # several at once

# 🌟 the headline trick: paste whatever ChatGPT / Claude told you
pkgguard --stdin < chat.txt
pbpaste | pkgguard --stdin                # macOS

# go deeper
pkgguard requirements.txt --scan          # download + statically scan source
pkgguard some/repo --deep                 # add fake-star analysis

pkgguard mines free text for pip install … / npm i … commands, GitHub links, inline `code spans` and bullet lists — and is careful not to flag plain English prose as packages. Names whose ecosystem isn't stated are checked against both PyPI and npm.
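The free-text mining described above might be approximated as follows. This is a simplified sketch, not the tool's parser: it only recognizes `pip install` / `npm i` commands that start a line or open an inline code span, and the regular expressions and function names are assumptions made for the example.

```python
import re

# Install commands at the start of a line (optionally after "$") or opening an
# inline code span. Prose such as "you can pip install it later" is not matched.
INSTALL_COMMAND = re.compile(
    r"(?:^|`)[ \t]*(?:\$[ \t]*)?"
    r"(pip3?[ \t]+install|npm[ \t]+(?:install|i))[ \t]+([^`\n]+)",
    re.MULTILINE,
)
PACKAGE_NAME = re.compile(r"^@?[A-Za-z0-9][A-Za-z0-9._/-]*$")


def strip_version(token, ecosystem):
    """Drop version pins: django==3.0.0 -> django, left-pad@1.3.0 -> left-pad."""
    if ecosystem == "pypi":
        return re.split(r"[=<>!~\[;]", token, maxsplit=1)[0]
    head, _, _ = token[1:].partition("@")  # keep a leading "@scope" intact
    return token[0] + head


def extract_packages(text):
    """Return (ecosystem, name) pairs in order of first appearance."""
    found = []
    for command, arguments in INSTALL_COMMAND.findall(text):
        ecosystem = "pypi" if command.startswith("pip") else "npm"
        for token in arguments.split():
            if token.startswith("#"):
                break  # trailing shell comment
            if token.startswith("-"):
                continue  # flags such as -U or --save-dev
            name = strip_version(token, ecosystem)
            if PACKAGE_NAME.match(name) and (ecosystem, name) not in found:
                found.append((ecosystem, name))
    return found
```

The sketch ignores GitHub links and bullet lists, and it would misread the argument of a flag such as `-r requirements.txt` as a package name, so treat it as a starting point only.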

Machine-readable output & CI

pkgguard -f requirements.txt --json               # JSON to stdout
pkgguard -f requirements.txt --markdown -o report.md
pkgguard -f requirements.txt --fail-on warn       # non-zero exit gates your pipeline

Exit codes: 0 clean · 1 at least one warning (only with --fail-on warn) · 2 at least one danger.
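The exit-code rule can be sketched as "take the worst verdict, then apply the --fail-on threshold". The enum ordering below, in particular where UNKNOWN ranks and that it never fails a run, is an assumption of this example rather than documented behaviour.

```python
from enum import IntEnum


class Verdict(IntEnum):
    # Ordered by severity so max() picks the worst; UNKNOWN's rank is assumed.
    OK = 0
    UNKNOWN = 1
    WARN = 2
    DANGER = 3


def exit_code(verdicts, fail_on="danger"):
    """Map per-package verdicts to a process exit code (0, 1 or 2)."""
    worst = max(verdicts, default=Verdict.OK)
    if worst == Verdict.DANGER:
        return 2
    if worst == Verdict.WARN and fail_on == "warn":
        return 1
    return 0
```

A run with one warning and no dangers therefore exits 0 by default and 1 only when `fail_on="warn"`, so a CI job can choose how strict to be.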


🔌 Integrations

pre-commit hook

# .pre-commit-config.yaml
- repo: https://github.com/Highcrypto7/pkgguard
  rev: v0.1.0
  hooks:
    - id: pkgguard          # auto-runs on requirements*.txt and package.json

GitHub Action

# .github/workflows/pkgguard.yml
- uses: Highcrypto7/pkgguard@v1
  with:
    files: "requirements.txt package.json"
    fail-on: danger

MCP server: let the assistant check its own answer

pip install "pkgguard-cli[mcp]"
pkgguard-mcp          # exposes vet_packages() and is_safe_to_install() over MCP

Register pkgguard-mcp in Claude Desktop / Cursor and the assistant can vet a package before it ever recommends it — stopping slopsquatting at the source.
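Registration usually means adding an entry to the client's MCP configuration file. The sketch below prints such an entry, assuming the `mcpServers` layout commonly used by Claude Desktop and Cursor; the server label `pkgguard` is arbitrary and the exact file location depends on the client, so check its documentation.

```python
import json


def mcp_server_entry(command="pkgguard-mcp"):
    """Build a config fragment that launches the MCP server as a subprocess."""
    return {"mcpServers": {"pkgguard": {"command": command, "args": []}}}


if __name__ == "__main__":
    # Paste the printed JSON into the client's MCP configuration file.
    print(json.dumps(mcp_server_entry(), indent=2))
```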


🥊 How it compares

Other tools are excellent at scanning dependencies you've already chosen. pkgguard is the fast first gate at the moment an AI (or a teammate) hands you a list.

Tools compared: pkgguard · sloppy-joe · depscope · GuardDog · Snyk / Socket.

Capabilities compared:

  • Hallucination / existence
  • Typosquat + homoglyph
  • Known CVEs (OSV)
  • Static source malware scan
  • License traps (AGPL / NC / …)
  • Maintenance / dead repo
  • Fake-star inflation
  • Paste a chat answer (free text)
  • MCP self-check for assistants
  • Open source · No key · Offline-degraded

Ecosystems covered: pkgguard 8 · sloppy-joe 2 · depscope 19 · GuardDog 5 · Snyk / Socket many.

pkgguard's edge: the widest set of checks in a single zero-key OSS gate, framed around AI output — including license and fake-star checks the others skip, and a "paste the chat answer" workflow nobody else has. Honest gap: GuardDog/Snyk/Socket do deeper source-level malware analysis; run them alongside pkgguard for defence in depth.


📊 Proof

  • Benchmark: 100% accuracy on a labeled set of 30 PyPI/npm packages (15 real, 15 hallucinated/typosquat). Reproduce: python benchmark/run_benchmark.py. See BENCHMARK.md.
  • Zero false positives when vetting the 50 most popular real PyPI/npm packages.
  • 67 automated tests, deterministic and offline.

🧭 Verdicts

  • ✅ OK: exists and nothing concerning found.
  • ⚠️ WARN: usable, but read the caveat (license, CVE, staleness, look-alike…).
  • ❌ DANGER: doesn't exist, or a strong risk signal. Don't install without verifying.
  • ❔ UNKNOWN: couldn't determine (offline / rate-limited). Honest about what it didn't check.

🔐 Design principles

  • No API key, ever. Public registry/GitHub metadata over HTTPS. Set GITHUB_TOKEN only to raise rate limits.
  • No code execution. The source scan parses with ast and pattern-matching; it never imports or runs package code, and extracts archives in-memory with strict size/path guards.
  • Honest by default. "Couldn't check" is ❔, never a silent ✅.
  • Fast & offline-friendly. On-disk response cache; a previous run answers even with no network.
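The "no code execution" principle can be illustrated with a small `ast`-based pass: the source is parsed into a syntax tree and walked, never imported or run. The list of suspicious calls and the function names are assumptions for this sketch; the real scan is described as covering more patterns.

```python
import ast

# Illustrative subset of calls worth flagging in install-time code.
SUSPICIOUS_CALLS = {
    "eval",
    "exec",
    "__import__",
    "os.system",
    "os.popen",
    "subprocess.run",
    "subprocess.call",
    "subprocess.Popen",
    "subprocess.check_output",
}


def dotted_name(node):
    """Render Name/Attribute chains such as os.system; None for anything else."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None


def scan_source(source):
    """Return sorted (line, call) pairs for suspicious calls in `source`."""
    tree = ast.parse(source)  # parsing only; nothing is executed
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in SUSPICIOUS_CALLS:
                hits.append((node.lineno, name))
    return sorted(hits)
```

Aliased imports (`import os as o`) and string-built calls slip past a check this simple, which is one reason such heuristics complement rather than replace dedicated scanners.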

⚠️ Limitations

  • Heuristics, not proof. A ✅ means "no red flags found," not a security guarantee.
  • The typosquat reference list is a curated set of popular packages, not all of every registry.
  • Fake-star and source-scan checks are opt-in and intentionally conservative — they complement, not replace, dedicated tools (StarScout, GuardDog).
  • Unauthenticated GitHub is limited to ~60 requests/hour; set GITHUB_TOKEN for large runs.
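As a sketch of how an optional GITHUB_TOKEN could be applied, the helper below builds a GitHub REST API request and attaches an Authorization header only when a token is available. The function name and User-Agent string are hypothetical.

```python
import os
import urllib.request

GITHUB_API = "https://api.github.com"


def github_request(path, token=None):
    """Build a GitHub API request; the token is optional and only raises limits."""
    if token is None:
        token = os.environ.get("GITHUB_TOKEN")
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "pkgguard-sketch",  # hypothetical client name
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(f"{GITHUB_API}{path}", headers=headers)
```

Passing the result to `urllib.request.urlopen`, for example with the path `/repos/psf/requests`, performs the actual call; without a token the same request still works, just under the lower unauthenticated limit.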

🗺️ Roadmap

  • More ecosystems (Maven, Hex, CPAN)
  • Large-scale benchmark against the trendmicro/slopsquatting dataset
  • VS Code extension
  • Deeper static source analysis

🤝 Contributing

Issues and PRs welcome — a new ecosystem is just a registry adapter, and a new check is a single module (see src/pkgguard/checks/). Run pytest before submitting.

📄 License

MIT — see LICENSE. Built to make the AI coding era a little safer.
