Cryptographic discovery scanner: find quantum-vulnerable cryptography in your code and emit a CycloneDX CBOM.

cryptohound

A cryptographic discovery scanner for the post-quantum migration. Point it at a Python repository and it looks for uses of public-key cryptography that a sufficiently large quantum computer could break, then emits a CycloneDX 1.6 CBOM (cryptographic bill of materials) and a human-readable report.

This finds problems. It does not fix them. cryptohound is the "you can't migrate what you can't see" step: discovery and reporting only. It does not change your code, manage keys, or perform a migration.

What it detects

Anything whose security rests on integer factorization or discrete logarithms (finite-field or elliptic-curve), the problems Shor's algorithm solves efficiently:

| Family | Detected via |
| --- | --- |
| RSA | cryptography, pycryptodome/PyCrypto (incl. pkcs1_15/pss), rsa, pyOpenSSL, paramiko |
| DSA | cryptography, pycryptodome, pyOpenSSL, paramiko |
| ECDSA / EC | cryptography, ecdsa, pycryptodome (DSS), paramiko |
| EdDSA (Ed25519/Ed448) | cryptography, PyNaCl, paramiko |
| Diffie-Hellman (DH) | cryptography |
| ECDH (incl. X25519/X448) | cryptography |
| ElGamal | pycryptodome |
| JWT asymmetric algs (RS*/PS*/ES*/EdDSA) | PyJWT, python-jose, Authlib |
| Asymmetric private-key loading | cryptography (load_*_private_key) |

Detection works in two ways: source-code patterns (via a Semgrep rule pack) and an optional dependency-manifest scan (requirements*.txt, pyproject.toml, enabled with --include-deps) that flags known classical-crypto libraries as a second, lower-confidence signal.
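To make the manifest signal concrete, here is a minimal sketch of scanning requirements.txt-style text for known crypto libraries. It is not cryptohound's implementation: the watch list, the `crypto_deps` helper, and the decision to handle only requirements files (not pyproject.toml) are assumptions made for illustration.

```python
import re

# Hypothetical watch list: PyPI names of the libraries in the detection table.
KNOWN_CRYPTO_LIBS = {
    "cryptography", "pycryptodome", "pycrypto", "rsa", "pyopenssl",
    "paramiko", "ecdsa", "pynacl", "pyjwt", "python-jose", "authlib",
}


def normalize(name: str) -> str:
    """Normalize a project name as PEP 503 does (lowercase, runs of -_. become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def crypto_deps(requirements_text: str) -> list:
    """Return the known crypto libraries named in requirements.txt-style text."""
    found = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comments
        if not line or line.startswith("-"):    # skip blanks and pip options (-r, -e, ...)
            continue
        # The project name is the leading run of name characters,
        # before any extras ([...]) or version specifier.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match and normalize(match.group()) in KNOWN_CRYPTO_LIBS:
            found.append(normalize(match.group()))
    return found


sample = """\
requests>=2.31
PyJWT==2.8.0  # tokens
-r base.txt
python-jose[cryptography]>=3.3
rsa_tools
"""
deps = crypto_deps(sample)
print(deps)
```

A manifest entry only shows that a library is installed, not that a vulnerable primitive is used, which is why this signal carries lower confidence than a source-code match.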

Rule patterns use fully qualified names, so Semgrep's import resolution matches only genuine crypto-library calls: a local variable named rsa or a generic generate_key() method is never flagged. On a labeled benchmark cryptohound scores 100% precision and 100% recall, and on a corpus of 107 real repositories it produced zero false positives; see EVALUATION.md.
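The idea behind fully qualified matching can be illustrated without Semgrep. The sketch below uses the standard-library `ast` module as a stand-in: it resolves each call back through the file's imports and compares the resulting dotted name against a target set. The `qualified_calls` helper and the one-entry `VULNERABLE_CALLS` set are hypothetical, and the sketch ignores scoping and shadowing, which a real engine has to handle.

```python
import ast

# Fully qualified calls this sketch treats as findings (one real API, for illustration).
VULNERABLE_CALLS = {
    "cryptography.hazmat.primitives.asymmetric.rsa.generate_private_key",
}


def qualified_calls(source: str) -> list:
    """Return fully qualified names for calls whose base name was imported."""
    tree = ast.parse(source)
    aliases = {}  # local name -> dotted name it refers to
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.asname:
                    aliases[alias.asname] = alias.name
                else:
                    # "import a.b.c" binds only the top-level name "a".
                    root = alias.name.split(".")[0]
                    aliases[root] = root
        elif isinstance(node, ast.ImportFrom) and node.module:
            for alias in node.names:
                aliases[alias.asname or alias.name] = f"{node.module}.{alias.name}"

    calls = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            # Unwind x.y.z(...) into its attribute chain and base name.
            attrs, target = [], node.func
            while isinstance(target, ast.Attribute):
                attrs.append(target.attr)
                target = target.value
            if isinstance(target, ast.Name) and target.id in aliases:
                calls.append(".".join([aliases[target.id], *reversed(attrs)]))
    return calls


SAMPLE = '''
from cryptography.hazmat.primitives.asymmetric import rsa as crypto_rsa

key = crypto_rsa.generate_private_key(public_exponent=65537, key_size=2048)

class Vault:
    def generate_key(self):
        return "not crypto"

rsa = Vault()      # a local variable that happens to be named rsa
rsa.generate_key()
'''

hits = [call for call in qualified_calls(SAMPLE) if call in VULNERABLE_CALLS]
print(hits)
```

Only the call that resolves to the cryptography library is reported; `rsa.generate_key()` never resolves through an import, so it is ignored.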

What it deliberately does NOT flag

To keep false positives low, quantum-resistant primitives are ignored: symmetric crypto at adequate sizes (AES-128/256) and SHA-256/384/512 hashing. JWT HS* (HMAC) algorithms are likewise not flagged.
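For JWTs, the split between flagged and ignored algorithms can be expressed as a small classifier over the `alg` header value. This is an illustrative sketch only; the function name and the regular expression are assumptions, not cryptohound's rules.

```python
import re

# RS*/PS* are RSA signatures and ES* are ECDSA (ES256K uses secp256k1).
_ASYMMETRIC_ALG = re.compile(r"^(RS|PS|ES)\d{3}K?$")


def jwt_alg_is_quantum_vulnerable(alg: str) -> bool:
    """True for asymmetric JWT algorithms; False for HS* (HMAC) and anything else."""
    return alg == "EdDSA" or bool(_ASYMMETRIC_ALG.match(alg))


for alg in ("RS256", "PS384", "ES512", "EdDSA", "HS256"):
    print(alg, jwt_alg_is_quantum_vulnerable(alg))
```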

Install

cryptohound uses Semgrep as its detection engine; Semgrep is installed automatically as a dependency.

pip install cryptohound        # from a release
# or, from a clone:
pip install -e .

Requires Python 3.9+.

Usage

cryptohound path/to/repo

This writes cbom.json and report.md to the current directory and prints a ranked summary.

Found 7 quantum-vulnerable crypto asset(s):

  [    high] ECDH             keys.py:30
             key agreement is exposed to harvest-now-decrypt-later: traffic
             captured today can be decrypted retroactively; found in first-party
             source; 256-bit parameter
  [  medium] RSA              keys.py:9
             public-key encryption protects confidentiality that breaks
             retroactively; found in first-party source; 2048-bit key
  ...

Flags

| Flag | Description |
| --- | --- |
| -o, --output-dir DIR | Where to write artifacts (default: current directory). |
| -f, --format {json,md,both} | Which artifacts to emit (default: both). |
| --fail-on-severity {info,low,medium,high,critical} | Exit non-zero if any finding is at or above this level. For CI. |
| --include-deps | Also report known crypto libraries listed in dependency manifests. Off by default to keep false positives low. |
| -q, --quiet | Suppress the console summary. |
| --version | Print version. |

Use in CI

- run: pip install cryptohound
- run: cryptohound . --format json --fail-on-severity high

The command exits 1 when any finding is at or above the threshold, failing the build.
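The gating rule is an ordered comparison over the five severity levels. A minimal sketch follows, assuming the levels are ordered as listed for --fail-on-severity and that the exit code is 0 when the flag is not given (the README does not state the latter); `exit_code` is a hypothetical helper, not part of cryptohound.

```python
# Levels in ascending order, as listed for --fail-on-severity.
SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]


def exit_code(finding_severities, fail_on=None):
    """Return 1 if any finding is at or above `fail_on`, else 0."""
    if fail_on is None:  # assumption: no threshold means never fail the build
        return 0
    threshold = SEVERITY_ORDER.index(fail_on)
    at_or_above = any(SEVERITY_ORDER.index(s) >= threshold for s in finding_severities)
    return 1 if at_or_above else 0


print(exit_code(["medium", "low"], fail_on="high"))   # below the threshold
print(exit_code(["medium", "high"], fail_on="high"))  # meets the threshold
```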

Output

cbom.json

A valid CycloneDX 1.6 BOM. Each finding is a component of type cryptographic-asset with cryptoProperties (primitive, curve, key size) and cryptohound metadata under properties:

{
  "type": "cryptographic-asset",
  "name": "RSA-2048",
  "cryptoProperties": {
    "assetType": "algorithm",
    "algorithmProperties": { "primitive": "pke", "classicalSecurityLevel": 2048 }
  },
  "properties": [
    { "name": "cryptohound:quantum_vulnerable", "value": "true" },
    { "name": "cryptohound:quantum_reason", "value": "RSA relies on integer factorization, broken by Shor's algorithm." },
    { "name": "cryptohound:severity", "value": "medium" },
    { "name": "cryptohound:severity_reason", "value": "..." },
    { "name": "cryptohound:location", "value": "keys.py:9" }
  ]
}
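Because the CBOM is plain JSON, downstream tooling can read it with the standard library. The sketch below assumes the findings sit in the BOM's top-level `components` array (the standard CycloneDX location) and uses the property names from the example above; `vulnerable_assets` and the inline sample BOM are illustrative, not shipped with cryptohound.

```python
import json


def vulnerable_assets(bom: dict) -> list:
    """Collect name, severity and location for each quantum-vulnerable asset."""
    rows = []
    for component in bom.get("components", []):
        if component.get("type") != "cryptographic-asset":
            continue
        # Flatten the name/value property list into a dict for easy lookup.
        props = {p["name"]: p["value"] for p in component.get("properties", [])}
        if props.get("cryptohound:quantum_vulnerable") == "true":
            rows.append({
                "name": component.get("name"),
                "severity": props.get("cryptohound:severity"),
                "location": props.get("cryptohound:location"),
            })
    return rows


# Inline sample shaped like the component above; in practice, load the file:
#     with open("cbom.json") as fh:
#         bom = json.load(fh)
bom = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.6",
  "components": [
    {
      "type": "cryptographic-asset",
      "name": "RSA-2048",
      "properties": [
        {"name": "cryptohound:quantum_vulnerable", "value": "true"},
        {"name": "cryptohound:severity", "value": "medium"},
        {"name": "cryptohound:location", "value": "keys.py:9"}
      ]
    }
  ]
}
""")
assets = vulnerable_assets(bom)
print(assets)
```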

report.md

Total findings, a severity-ranked table, and a short per-family "what to do next" note pointing at the relevant NIST PQC standard (ML-KEM / ML-DSA / SLH-DSA).
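The "what to do next" pointer amounts to a lookup from primitive to NIST standard. The mapping below reflects the published standards (ML-KEM is FIPS 203, ML-DSA is FIPS 204, SLH-DSA is FIPS 205); keying it by the primitive names used in the rule metadata, and the `suggest` helper itself, are assumptions about how such a note could be generated.

```python
# Key establishment and public-key encryption map to a KEM;
# signatures map to one of the two signature standards.
NEXT_STEP = {
    "key-agree": ["ML-KEM (FIPS 203)"],
    "pke": ["ML-KEM (FIPS 203)"],
    "signature": ["ML-DSA (FIPS 204)", "SLH-DSA (FIPS 205)"],
}


def suggest(primitive: str) -> list:
    """Return candidate NIST PQC standards for a primitive, or [] if unknown."""
    return NEXT_STEP.get(primitive, [])


print(suggest("key-agree"))
print(suggest("signature"))
```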

How severity is decided

True data-sensitivity needs human judgment, so cryptohound does not guess it. Severity is an explainable heuristic built only from signals the tool can observe, and the reasoning is always emitted alongside the level:

  • Primitive — key-agreement ranks highest (harvest-now-decrypt-later: captured traffic is decryptable retroactively), then signatures, then public-key encryption.
  • Key size — smaller classical keys are already weaker, raising urgency.
  • Detection locus — first-party source ranks above a dependency-only mention.
  • Third-party touch — usage flowing through a declared crypto dependency.

Treat the ranking as a starting point for triage, not a verdict.
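A heuristic of this kind can be sketched as a small additive score that records its reasons as it goes. Everything numeric here is hypothetical: the weights, the level cut-offs, and the `Finding` fields are chosen only so the example reproduces the two findings in the sample output above, and the third-party signal is left out for brevity. cryptohound's actual scoring may differ.

```python
from dataclasses import dataclass

# Hypothetical weights following the documented ordering:
# key agreement > signatures > public-key encryption.
PRIMITIVE_POINTS = {"key-agree": 3, "signature": 2, "pke": 1}


@dataclass
class Finding:
    primitive: str     # "key-agree", "signature" or "pke"
    small_key: bool    # key or parameter size judged small for its family
    first_party: bool  # found in source rather than only in a manifest


def assess(finding: Finding):
    """Return (level, reasons) so the explanation always travels with the level."""
    points = PRIMITIVE_POINTS.get(finding.primitive, 1)
    reasons = [f"primitive: {finding.primitive}"]
    if finding.small_key:
        points += 1
        reasons.append("small classical key")
    if finding.first_party:
        points += 1
        reasons.append("found in first-party source")
    else:
        reasons.append("dependency-only mention")
    # Hypothetical cut-offs for turning points into a level.
    level = "high" if points >= 4 else "medium" if points >= 2 else "low"
    return level, reasons


print(assess(Finding("key-agree", small_key=False, first_party=True)))
print(assess(Finding("pke", small_key=False, first_party=True)))
```

Returning the reasons next to the level keeps the result auditable: a reviewer who disagrees with a weight can see which signal drove the ranking.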

Extending the rules

Detection rules live in src/cryptohound/rules/ as standard Semgrep YAML, one file per algorithm family. To add coverage, drop in a new rule with this metadata block and it flows through to the CBOM and report automatically:

rules:
  - id: cryptohound-<family>-<context>
    languages: [python]
    severity: WARNING
    message: "..."
    metadata:
      algorithm: <Name>       # e.g. RSA
      primitive: <primitive>  # signature | key-agree | pke | encryption
      quantum_vulnerable: true
      reason: "one-line why"
      library: <lib>
      family: <family>        # e.g. rsa
    patterns:
      - pattern-either:
          - pattern: your.api.call(...)

Run pytest to confirm the fixtures still pass.
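Before running the suite, a new rule's metadata can be sanity-checked with a few lines of Python. This sketch works on a rule that has already been parsed into a dict (for example by a YAML loader, which is not shown); the required keys and allowed primitives are taken from the template above, while `metadata_problems` is a hypothetical helper rather than part of cryptohound.

```python
REQUIRED_KEYS = {"algorithm", "primitive", "quantum_vulnerable", "reason", "library", "family"}
ALLOWED_PRIMITIVES = {"signature", "key-agree", "pke", "encryption"}


def metadata_problems(rule: dict) -> list:
    """Return human-readable problems with one parsed rule; empty means it looks fine."""
    problems = []
    if not str(rule.get("id", "")).startswith("cryptohound-"):
        problems.append("id should start with 'cryptohound-'")
    metadata = rule.get("metadata", {})
    for key in sorted(REQUIRED_KEYS - set(metadata)):
        problems.append(f"missing metadata key: {key}")
    primitive = metadata.get("primitive")
    if primitive is not None and primitive not in ALLOWED_PRIMITIVES:
        problems.append(f"unknown primitive: {primitive}")
    return problems


# Example rule with placeholder values, shaped like the template above.
rule = {
    "id": "cryptohound-rsa-keygen",
    "languages": ["python"],
    "severity": "WARNING",
    "message": "RSA key generation",
    "metadata": {
        "algorithm": "RSA",
        "primitive": "pke",
        "quantum_vulnerable": True,
        "reason": "RSA relies on integer factorization, broken by Shor's algorithm.",
        "library": "cryptography",
        "family": "rsa",
    },
}
print(metadata_problems(rule))
```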

Scope (v1)

In scope: Python source + manifest scanning, CBOM, report, CI gating.

Out of scope: other languages, hosted dashboards/UI/DB, network or TLS/cert scanning, HSM inspection, and any automated migration or code-fixing.

Troubleshooting

ModuleNotFoundError: No module named 'pkg_resources' when scanning. Semgrep depends (transitively) on pkg_resources, which ships inside setuptools. setuptools 81+ no longer includes it, which breaks Semgrep. Pin an older setuptools release in your environment:

pip install "setuptools<81"
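To confirm whether an environment is affected before or after pinning, a quick standard-library check is enough. This is a diagnostic sketch, not part of cryptohound; `pkg_resources_status` is a hypothetical helper name.

```python
import importlib.metadata
import importlib.util


def pkg_resources_status():
    """Return (pkg_resources importable?, installed setuptools version or None)."""
    available = importlib.util.find_spec("pkg_resources") is not None
    try:
        version = importlib.metadata.version("setuptools")
    except importlib.metadata.PackageNotFoundError:
        version = None  # setuptools is not installed in this environment
    return available, version


available, version = pkg_resources_status()
print(f"pkg_resources importable: {available}; setuptools version: {version}")
```

Run it with the same interpreter that runs cryptohound, since each virtual environment has its own setuptools.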

pip install -e . fails with "editable mode currently requires a setuptools-based build". Your pip is too old for pyproject.toml-only projects. Upgrade it first: pip install --upgrade pip setuptools wheel.

First scan is slow. Semgrep initializes its engine on first run (10–30s); subsequent runs are fast.

Development

pip install -e ".[dev]"
pytest -n auto             # ~100 tests, parallel (detection tests invoke Semgrep)
python benchmark/run_benchmark.py   # precision / recall on the labeled benchmark

The suite covers per-library detection, explicit false-positive guards (AES, SHA-2, HMAC, Fernet, locally-named variables), CLI flags and exit codes, dependency parsing, the severity heuristic, CycloneDX 1.6 validity, and a benchmark regression gate.

cryptohound is validated two ways — a labeled benchmark and a corpus of 107 real repositories; see EVALUATION.md for the precision/recall and false-negative analysis.

License

MIT — see LICENSE.
