Dead-code forensics CLI — find dead code, understand why it died, and safely delete it.

fossil — dead code forensics

Find dead code. Understand why it died. Safely delete it.

fossil is a command-line forensics tool that goes beyond detecting dead code to explaining its history: when it died, what killed it, who wrote it, why it existed, and whether it is genuinely safe to delete.

It combines static analysis, git history mining, and pattern detection into a single terminal command that answers in under 3 seconds.

Why fossil?

Every mature codebase accumulates dead code. Existing tools tell you what is dead. None of them tell you why.

Question                             Other Tools   fossil
Is this file imported anywhere?
When did it become dead?                           ✅ exact death commit with date
What PR replaced it?                               ✅ PR number, title, author
Who wrote it originally?                           ✅ original author from git blame
Is there a "keep for now" comment?                 ✅ detects and verifies the condition
Is it safe to delete?                              ✅ 0–100% confidence score
Can it auto-delete for me?                         ✅ --yolo creates a PR

Installation

pip install fossil-code

Requires Python 3.11+ and git.

Quick Start

# Full forensic report for one file
fossil explain src/billing/legacy_processor.py

# Scan an entire directory
fossil scan ./src

# Machine-readable JSON output
fossil explain src/billing/legacy_processor.py --json

# Prioritized deletion backlog
fossil clean ./src --threshold 85

# Plain text mode (for piping / CI)
fossil explain src/billing/legacy_processor.py --plain

Example Output

╭─────────────────────────────────── fossil ───────────────────────────────────╮
│                                                                              │
│    FORENSIC REPORT  src/billing/legacy_processor.py                          │
│    Status  ● DEAD   Language  Python                                         │
│  ╭─────────────────────────────── History ────────────────────────────────╮  │
│  │  Dead since        2023-03-14                                          │  │
│  │  Death commit      a3f9b21  "Migrate to Stripe v3 — replace legacy     │  │
│  │                    SCA handler (#441)"                                 │  │
│  │  PR                #441 · Migrate to Stripe v3 — replace legacy SCA    │  │
│  │                    handler (#441)                                      │  │
│  │  Original by       Sarah Chen · first committed 2022-06-12             │  │
│  ╰────────────────────────────────────────────────────────────────────────╯  │
│  ╭──────────────────────────── Temporary Hold ────────────────────────────╮  │
│  │   Pattern  "keeping this around until Q2 rollout completes" (line 3)   │  │
│  │   Status   ✓ RESOLVED  —  PR #489 merged April 12, 2023                │  │
│  ╰────────────────────────────────────────────────────────────────────────╯  │
│  ╭─────────────────────────── Static Analysis ────────────────────────────╮  │
│  │   Call sites          0        Dynamic imports     0                   │  │
│  │   Import refs         0        Reflection          None detected       │  │
│  │   Test references     0        Config refs         0                   │  │
│  ╰────────────────────────────────────────────────────────────────────────╯  │
│  ╭────────────────────────────── Confidence ──────────────────────────────╮  │
│  │    91%  ██████████████████░░  HIGH CONFIDENCE · LOW RISK               │  │
│  ╰────────────────────────────────────────────────────────────────────────╯  │
│    Suggested   rm src/billing/legacy_processor.py                            │
│    Auto-PR     fossil explain src/billing/legacy_processor.py --yolo         │
│    Analysis duration: 1840ms                                                 │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

How It Works

For every file analyzed, fossil runs five stages in under 3 seconds:

┌──────────────┐    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│   Static     │    │   Git        │    │   Pattern    │    │  Confidence  │    │   Output     │
│   Analysis   │───▶│   History    │───▶│   Detection  │───▶│   Scoring    │───▶│   Rendering  │
│              │    │   Mining     │    │              │    │              │    │              │
│ • imports    │    │ • death      │    │ • TODO:      │    │ • 14 signals │    │ • Rich panel │
│ • call sites │    │   commit     │    │   remove     │    │ • 0-100%     │    │ • JSON       │
│ • dynamic    │    │ • PR number  │    │ • DEPRECATED │    │ • risk label │    │ • plain text │
│ • reflection │    │ • author     │    │ • keep for   │    │              │    │              │
│              │    │ • blame      │    │   now        │    │              │    │              │
└──────────────┘    └──────────────┘    └──────────────┘    └──────────────┘    └──────────────┘
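
The pattern detection stage in the diagram can be pictured with a small sketch. The regular expressions, the `HoldMarker` type, and the `find_hold_markers` function below are hypothetical names modeled on the three marker types the diagram lists; they are not fossil's actual patterns or API.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns, one per marker type named in the diagram.
HOLD_PATTERNS = {
    "todo_remove": re.compile(r"\bTODO\b[:\s]+.*\bremove\b", re.IGNORECASE),
    "deprecated": re.compile(r"\bDEPRECATED\b", re.IGNORECASE),
    "keep_for_now": re.compile(
        r"\bkeep(?:ing)?\b.*\b(?:for now|until)\b", re.IGNORECASE
    ),
}


@dataclass(frozen=True)
class HoldMarker:
    kind: str          # key from HOLD_PATTERNS
    line_number: int   # 1-based line in the scanned file
    text: str          # the matching line, stripped


def find_hold_markers(source: str) -> list[HoldMarker]:
    """Scan source text line by line and collect every marker match."""
    markers = []
    for number, line in enumerate(source.splitlines(), start=1):
        for kind, pattern in HOLD_PATTERNS.items():
            if pattern.search(line):
                markers.append(HoldMarker(kind, number, line.strip()))
    return markers
```

A line-based scan like this only finds the marker; verifying whether the stated condition has since been met (as the Temporary Hold panel in the example output does) would need the git and PR data from the earlier stage.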

Confidence Score

The confidence score aggregates 14 weighted signals:

Signal                          Weight   Direction
Zero call sites (static)        +30      Positive
No dynamic references           +20      Positive
Death commit identified         +15      Positive
Temporary hold resolved         +10      Positive
No reflection patterns          +10      Positive
File age > 90 days dead         +8       Positive
PR/migration context found      +7       Positive
Dynamic import detected         −30      Negative
Reflection/getattr detected     −20      Negative
File modified < 30 days ago     −20      Negative
"Keep for now" unresolved       −15      Negative
Language unknown (fallback)     −15      Negative
Test file references found      −10      Negative
Death commit ambiguous          −10      Negative

Risk labels:

  • 85–100%: High Confidence · Low Risk
  • 70–84%: Medium-High
  • 55–69%: Medium
  • <55%: Low Confidence · High Risk
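
To make the table concrete, here is a minimal sketch of one way the weights and risk labels could fit together. The weights and label boundaries are copied from above. The signal key names and the aggregation rule (a plain sum clamped to 0–100) are assumptions, since the README does not say how fossil combines the signals.

```python
# Weights copied from the signal table; the key names are hypothetical.
SIGNAL_WEIGHTS = {
    "zero_call_sites": 30,
    "no_dynamic_references": 20,
    "death_commit_identified": 15,
    "temporary_hold_resolved": 10,
    "no_reflection_patterns": 10,
    "dead_over_90_days": 8,
    "pr_context_found": 7,
    "dynamic_import_detected": -30,
    "reflection_detected": -20,
    "modified_under_30_days": -20,
    "keep_for_now_unresolved": -15,
    "language_unknown": -15,
    "test_references_found": -10,
    "death_commit_ambiguous": -10,
}


def confidence(active_signals: set[str]) -> int:
    """Assumed aggregation: sum the active weights, then clamp to 0..100."""
    total = sum(SIGNAL_WEIGHTS[name] for name in active_signals)
    return max(0, min(100, total))


def risk_label(score: int) -> str:
    """Map a 0-100 score onto the documented risk bands."""
    if score >= 85:
        return "High Confidence · Low Risk"
    if score >= 70:
        return "Medium-High"
    if score >= 55:
        return "Medium"
    return "Low Confidence · High Risk"
```

Under this assumed rule, a file with zero call sites, no dynamic references, and an identified death commit scores 65 and lands in the Medium band. fossil's real scoring may weight or normalize the signals differently.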

Commands

fossil explain <file>

Full forensic report for a single file.

fossil explain src/billing/legacy.py              # Rich panel output
fossil explain src/billing/legacy.py --json        # JSON output
fossil explain src/billing/legacy.py --plain       # Plain text
fossil explain src/billing/legacy.py --no-cache    # Skip cache
fossil explain src/billing/legacy.py --depth 2000  # Deeper git history

Flag          Default  Description
--json        false    Machine-readable JSON output
--plain       false    Plain text (no Rich formatting)
--no-color    false    Disable ANSI colors
--no-cache    false    Skip cache read/write
--depth N     500      Max git commits to traverse
--remote      auto     Force remote: github, gitlab, none, auto
--yolo        false    Create deletion PR if confidence ≥ 90%
--force-yolo  false    Create deletion PR regardless of confidence
--narrate     false    LLM narration (requires provider config)

fossil scan [directory]

Scan a directory for all dead files above a confidence threshold.

fossil scan ./src                       # Scan with default 70% threshold
fossil scan ./src --threshold 85        # Only high-confidence results
fossil scan ./src --language py,js      # Filter by language
fossil scan ./src --exclude "**/test*"  # Exclude patterns
fossil scan ./src --json                # JSON for CI pipelines

fossil clean [directory]

Prioritized deletion backlog — ranked by confidence.

fossil clean ./src --threshold 80       # Show deletion candidates
fossil clean ./src --dry-run            # Preview what would be done
fossil clean ./src --json               # Machine-readable output

fossil config

fossil config set github_token ghp_xxxx    # Store GitHub token
fossil config set llm_provider openai      # Configure LLM provider
fossil config show                          # Show config (tokens masked)

fossil cache

fossil cache clear    # Delete analysis cache
fossil cache stats    # Show cache statistics

Exit Codes

Code  Meaning                             CI Use
0     Dead code found, report generated   Fail CI check
1     Unexpected error                    Fail CI check
2     File not found                      Fail CI check
3     Not a git repository                Fail CI check
4     File is NOT dead (actively used)    Pass CI check
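
Because exit code 0 means dead code was found, the table inverts the usual shell convention where 0 means success. A wrapper has to translate the code before handing it to CI. The sketch below shows one way to do that in Python. `ci_status` and `run_scan` are hypothetical helper names, and only the command-line flags documented above are used.

```python
import subprocess

# Exit codes as documented in the table above.
DEAD_CODE_FOUND = 0
NOT_DEAD = 4


def ci_status(fossil_exit_code: int) -> int:
    """Translate a fossil exit code into a conventional CI status.

    Only NOT_DEAD (4) counts as a pass; dead code and every error fail.
    """
    return 0 if fossil_exit_code == NOT_DEAD else 1


def run_scan(directory: str = ".", threshold: int = 90) -> int:
    """Run `fossil scan` and return its raw exit code without raising."""
    result = subprocess.run(
        ["fossil", "scan", directory, "--threshold", str(threshold), "--json"],
        capture_output=True,
        text=True,
        check=False,  # non-zero codes are meaningful here, not exceptional
    )
    return result.returncode
```

In a CI script this would be called as `sys.exit(ci_status(run_scan(".")))`, so that the job fails on dead code or on any error and passes only when fossil reports exit code 4.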

CI Integration

GitHub Actions

- name: Check for dead code
  run: |
    pip install fossil-code
    status=0
    fossil scan . --threshold 90 --json > dead_report.json || status=$?
    # fossil exits 0 when dead code is found above 90% and 4 when none is,
    # so only exit code 4 lets the step pass.
    [ "$status" -eq 4 ]

Pre-commit Hook

#!/bin/bash
fossil scan . --threshold 90 --json --no-cache > /dev/null 2>&1
if [ $? -eq 0 ]; then
  echo "⚠️  Dead code detected above 90% confidence. Run 'fossil scan .' for details."
  exit 1
fi

Configuration

User Config: ~/.config/fossil/config.toml

github_token = ""
gitlab_token = ""
llm_api_key = ""
llm_provider = "openai"
llm_model = "gpt-4o-mini"
default_depth = 500
cache_ttl_hours = 24

Project Config: .fossil.toml

Commit this to your repo root so the whole team shares settings:

[analysis]
languages = ["py", "js", "ts"]
exclude_patterns = ["**/migrations/**", "**/generated/**"]

[thresholds]
minimum_confidence = 70
yolo_minimum_confidence = 90

[pr]
base_branch = "main"
pr_labels = ["dead-code-cleanup", "automated"]

Environment Variables

All environment variables override config file values:

Variable             Purpose
GITHUB_TOKEN         GitHub API authentication
GITLAB_TOKEN         GitLab API authentication
FOSSIL_LLM_API_KEY   LLM provider API key
FOSSIL_LLM_PROVIDER  LLM provider (openai / anthropic / ollama)
FOSSIL_LLM_MODEL     LLM model name
NO_COLOR             Disable ANSI colors (no-color.org)

Language Support

Language    Analyzer       Capability
Python      ast module     Deep import, call, dynamic import, reflection analysis
JavaScript  Text fallback  Filename/symbol reference search
TypeScript  Text fallback  Filename/symbol reference search
Java        Text fallback  Filename/symbol reference search
Go          Text fallback  Filename/symbol reference search
Other       Text fallback  Filename reference search

Python gets the deepest analysis via the ast module. Other languages use conservative text-based reference search as a fallback. tree-sitter integration for deeper multi-language analysis is planned.
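
As an illustration of what ast-based analysis can detect, the sketch below walks a Python file's syntax tree and records static imports of a given module, dynamic import calls, and `getattr` reflection. The `find_references` function and its result keys are hypothetical; fossil's own analyzer may work differently, and this version ignores relative imports.

```python
import ast


def find_references(source: str, module: str) -> dict[str, list[int]]:
    """Return line numbers of static imports of `module`, plus lines that
    contain dynamic-import or getattr calls (which block safe deletion)."""
    found = {"imports": [], "dynamic_imports": [], "reflection": []}

    def matches(name: str) -> bool:
        # True for the module itself or anything nested under it.
        return name == module or name.startswith(module + ".")

    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            if any(matches(alias.name) for alias in node.names):
                found["imports"].append(node.lineno)
        elif isinstance(node, ast.ImportFrom):
            base = node.module or ""
            # `from pkg import mod` may refer to pkg or to pkg.mod.
            targets = {base} | {f"{base}.{alias.name}" for alias in node.names}
            if any(matches(target) for target in targets):
                found["imports"].append(node.lineno)
        elif isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                called = func.id
            else:
                called = getattr(func, "attr", None)
            if called in {"import_module", "__import__"}:
                found["dynamic_imports"].append(node.lineno)
            elif called == "getattr":
                found["reflection"].append(node.lineno)

    # ast.walk is breadth-first, so sort to report in file order.
    return {kind: sorted(lines) for kind, lines in found.items()}
```

A text fallback, by contrast, can only search for the filename or symbol as a string, which is why the non-Python analyzers are described as conservative.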

Offline by Default

fossil works with zero network access. The core analysis pipeline (static analysis → git mining → pattern detection → confidence scoring) runs entirely offline.

Network is only used for three optional features:

  • GitHub/GitLab API: PR title/body lookup
  • GitHub/GitLab API: --yolo PR creation
  • LLM API: --narrate natural language explanation

Development

git clone https://github.com/iamvvek/fossil.git
cd fossil
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

# Run tests (85+ tests)
pytest -v

# Lint
ruff check src/ tests/

See CONTRIBUTING.md for the full development guide.

Roadmap

  • Phase 1 — Python forensics with Rich output
  • Phase 2 — Multi-language scan, pattern detection, parallel processing
  • Phase 3 — GitHub/GitLab API integration, --yolo PR creation
  • Phase 4 — LLM narration, VS Code extension

License

MIT — use it, fork it, ship it.
