Python 3.10+ MIT License Cross-platform

what-changed

Compare anything. Understand everything. Break nothing.

A universal CLI tool that compares files, folders, configs, documents, APIs, and dependencies—then explains what changed and what could break.


Why what-changed?

Traditional diff tools show you what is different. what-changed tells you why it matters:

┌─────────────────────────────────────────────────────────────────────────────┐
│ Traditional Diff              │ what-changed                                │
├───────────────────────────────┼─────────────────────────────────────────────┤
│ < host: localhost             │ [X] BREAKING: database.host changed         │
│ > host: prod-db.example.com   │                                             │
│                               │ Impact: Database connection points to prod  │
│                               │ Could break: Local dev environments         │
│                               │ Risk Score: 9/10                            │
└─────────────────────────────────────────────────────────────────────────────┘

Key Features

Feature            Description
Smart Detection    Auto-identifies 25+ file formats
Semantic Diffing   Understands structure, not just text
Risk Scoring       Predicts what might break (0-10 scale)
OCR Support        Extracts text from scanned PDFs & images
Git Integration    Compare commits, branches, and tags
Interactive TUI    Explore changes in a terminal UI
CI/CD Ready        Exit codes for automation pipelines
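
To make "understands structure, not just text" concrete, here is a minimal sketch of a structural diff over nested dictionaries. It is an illustration only, not what-changed's actual diff engine; the function name diff_paths and the tuple layout are assumptions made for this example.

```python
from typing import Any, Iterator


def diff_paths(old: Any, new: Any, path: str = "") -> Iterator[tuple[str, str, Any, Any]]:
    """Yield (operation, dotted_path, old_value, new_value) for each difference."""
    if isinstance(old, dict) and isinstance(new, dict):
        # Walk the union of keys so additions and removals are both seen.
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}.{key}" if path else str(key)
            if key not in new:
                yield ("removed", child, old[key], None)
            elif key not in old:
                yield ("added", child, None, new[key])
            else:
                yield from diff_paths(old[key], new[key], child)
    elif old != new:
        # Leaf values (or mismatched types) that differ count as one change.
        yield ("changed", path, old, new)


old = {"database": {"host": "localhost", "port": 5432}}
new = {
    "database": {"host": "prod-db.example.com", "port": 5432},
    "cache": {"enabled": True},
}
changes = list(diff_paths(old, new))
for op, dotted, before, after in changes:
    print(op, dotted, before, after)
```

Because the comparison follows keys rather than lines, reordering keys or reindenting a file produces no changes, while database.host is reported by its full path.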

Installation

pip install what-changed

Optional Extras

# Full installation with all formats
pip install "what-changed[all]"

# Individual extras (quoted so shells such as zsh do not expand the brackets)
pip install "what-changed[pdf]"      # PDF support (PyMuPDF)
pip install "what-changed[ocr]"      # OCR for scanned docs (Tesseract)
pip install "what-changed[openapi]"  # OpenAPI/Swagger specs
pip install "what-changed[excel]"    # Excel spreadsheets
pip install "what-changed[image]"    # Image comparison
pip install "what-changed[sql]"      # SQL schema analysis

From Source

git clone https://github.com/yourusername/what-changed
cd what-changed
pip install -e ".[all]"

Quick Start

# Compare any two files
what-changed compare old.json new.json

# Compare directories
what-changed compare ./v1 ./v2

# Show only breaking changes
what-changed compare config.yaml config.prod.yaml --breaking-only

# Interactive graph view
what-changed compare api-v1.yaml api-v2.yaml --graph

# Visual stats dashboard
what-changed compare before/ after/ --stats

# Git integration
what-changed git HEAD~1..HEAD
what-changed git main..feature-branch

Supported Formats

Configuration Files

Format  Extensions     What's Analyzed
JSON    .json          Keys, values, nested structures
YAML    .yaml, .yml    Full YAML including anchors
TOML    .toml          Sections, tables, arrays
INI     .ini, .cfg     Sections and key-value pairs
ENV     .env, .env.*   Environment variables

Documents

Format      Extensions  What's Analyzed
PDF         .pdf        Pages, text, TOC, metadata, OCR for scanned docs
Markdown    .md         Headings, links, code blocks, sections
Plain Text  .txt        Line-by-line diff

API Specifications

Format       Extensions    What's Analyzed
OpenAPI 3.x  .yaml, .json  Endpoints, schemas, parameters, responses
Swagger 2.0  .yaml, .json  Full spec comparison

Data Files

Format  Extensions   What's Analyzed
CSV     .csv         Schema, columns, rows, statistics
TSV     .tsv         Tab-separated values
Excel   .xlsx, .xls  Multiple sheets, cells

Infrastructure

Format          Files         What's Analyzed
Dockerfile      Dockerfile    Base image, instructions, ports, health checks
Docker Compose  compose.yaml  Services, volumes, networks
SQL             .sql          Tables, columns, indexes, constraints

Dependencies

Ecosystem  Files                             What's Analyzed
Node.js    package.json                      Dependencies, scripts, versions
Python     requirements.txt, pyproject.toml  Packages, version constraints
Ruby       Gemfile                           Gems, sources
Rust       Cargo.toml                        Crates, features
Go         go.mod                            Modules, versions
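
As a rough illustration of what comparing dependency sets involves, the sketch below classifies changes between two name-to-version mappings, such as the "dependencies" object of a package.json. It is not the tool's implementation: the helper names and the "first integer is the major version" shortcut are assumptions for this example, and real version specifiers (ranges, 0.x releases, git URLs) need a proper parser.

```python
import re
from typing import Optional


def major(version: str) -> Optional[int]:
    """Return the first integer in a spec such as '^1.2.3' or '>=2.0', if any."""
    match = re.search(r"\d+", version)
    return int(match.group()) if match else None


def compare_dependencies(old: dict, new: dict) -> dict:
    """Group dependency names by the kind of change between two mappings."""
    report = {"removed": [], "added": [], "major_bump": [], "other_change": []}
    for name in sorted(old.keys() | new.keys()):
        if name not in new:
            report["removed"].append(name)
        elif name not in old:
            report["added"].append(name)
        elif old[name] != new[name]:
            before, after = major(old[name]), major(new[name])
            bumped = before is not None and after is not None and after > before
            report["major_bump" if bumped else "other_change"].append(name)
    return report


old_deps = {"express": "^4.18.2", "lodash": "^4.17.21", "left-pad": "1.3.0"}
new_deps = {"express": "^5.0.0", "lodash": "^4.17.22", "zod": "^3.22.0"}
report = compare_dependencies(old_deps, new_deps)
print(report)
```

Grouping by kind of change is what lets a removed package or a major bump be treated differently from a patch-level update.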

Media & Archives

Format    Extensions                What's Analyzed
Images    .png, .jpg, .gif, .webp   Dimensions, metadata, OCR text extraction
Archives  .zip, .tar, .tar.gz       File listing, content changes

Output Modes

Human-Readable (Default)

what-changed compare config.old.json config.new.json
┌─────────────────────────────────────────────────────────────────────────────┐
│ Configuration Comparison                                                    │
│   A: config.old.json                                                        │
│   B: config.new.json                                                        │
└─────────────────────────────────────────────────────────────────────────────┘

[X] 1 breaking  [!] 2 risky  [o] 3 safe

┌─────────── Changes ───────────┐
│ config                        │
│ ├── database (2 changes)      │
│ │   ├── - host <str>          │
│ │   │     localhost           │
│ │   │     -> prod-db.com      │
│ │   └── ~ port <int>          │
│ │         5432 -> 5433        │
│ └── + cache (3 changes)       │
│     └── enabled = true        │
└───────────────────────────────┘

Interactive Graph TUI

what-changed compare api-v1.yaml api-v2.yaml --graph

Navigate changes with keyboard shortcuts:

  • j/k or arrows: Navigate
  • Enter: Expand/collapse
  • b: Show breaking only
  • q: Quit

Visual Stats Dashboard

what-changed compare before/ after/ --stats

Shows ASCII charts with change distribution, risk breakdown, and file type statistics.

JSON Output

what-changed compare config.json config.prod.json --json

Summary Mode

what-changed compare src/ dist/ --summary

Git Integration

Compare any git references directly:

# Compare with previous commit
what-changed git HEAD~1..HEAD

# Compare branches
what-changed git main..feature-branch

# Compare tags
what-changed git v1.0.0..v2.0.0

# Compare specific file across commits
what-changed git HEAD~5..HEAD -p src/config.json

# Show only breaking changes
what-changed git main..develop --breaking-only
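
For readers curious how a ref spec such as HEAD~1..HEAD could be resolved into two file versions, here is a small sketch. It is not how what-changed itself is implemented; split_ref_spec and file_at are hypothetical helpers, and the only external command used is the standard git show REF:PATH.

```python
import subprocess


def split_ref_spec(ref_spec: str) -> tuple[str, str]:
    """Split 'A..B' into ('A', 'B'); an empty side defaults to HEAD, as in git."""
    if "..." in ref_spec:
        raise ValueError("three-dot ranges are not handled in this sketch")
    base, sep, target = ref_spec.partition("..")
    if not sep:
        raise ValueError(f"expected A..B, got {ref_spec!r}")
    return (base or "HEAD", target or "HEAD")


def file_at(ref: str, path: str) -> str:
    """Return the content of `path` as of `ref` (must run inside a git repository)."""
    result = subprocess.run(
        ["git", "show", f"{ref}:{path}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


print(split_ref_spec("HEAD~1..HEAD"))
print(split_ref_spec("v1.0.0..v2.0.0"))
```

With the two refs in hand, file_at(base, "src/config.json") and file_at(target, "src/config.json") would give the two versions to compare.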

OCR Support

Scanned PDFs

Automatically extracts text from scanned PDF documents using Tesseract OCR:

what-changed compare scanned_v1.pdf scanned_v2.pdf
┌─────────────────────────────────────────────────────────────────────────────┐
│ PDF Document Comparison                                                     │
│   A: scanned_v1.pdf (OCR detected)                                          │
│   B: scanned_v2.pdf (OCR detected)                                          │
└─────────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────── page_1  ──────────────────────────────────┐
│ -Invoice #12345                                                             │
│ +Invoice #12345 - REVISED                                                   │
│ -Total: $450                                                                │
│ +Total: $550                                                                │
└─────────────────────────────────────────────────────────────────────────────┘

Images with Text

Compare images and extract text changes:

what-changed compare screenshot_old.png screenshot_new.png

Requirements: OCR needs the Tesseract engine installed on your system. Follow the Tesseract installation instructions for your platform.


Risk Scoring

Changes are scored 0-10 based on potential impact:

Score  Level     Examples
8-10   BREAKING  Removed endpoints, deleted dependencies, schema type changes
5-7    RISKY     Major version bumps, config value changes, deprecated features
0-4    SAFE      Added optional fields, new dependencies, comment changes

Common Risk Patterns

Change Pattern         Score  Why
API endpoint removed   10     Clients will fail
Required field added   9      Existing data won't validate
Database host changed  9      Connection failures
Dependency removed     9      Runtime errors
Major version bump     7      Breaking API changes
Port number changed    6      Connection issues
New optional field     2      Backward compatible
Comment updated        1      No runtime impact
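
The score bands above translate directly into a lookup. The following sketch encodes the two tables as data; the function name risk_level and the PATTERN_SCORES dictionary are illustrative and not part of the package's API.

```python
def risk_level(score: int) -> str:
    """Map a 0-10 risk score to the level names used in the table above."""
    if not 0 <= score <= 10:
        raise ValueError(f"score must be between 0 and 10, got {score}")
    if score >= 8:
        return "BREAKING"
    if score >= 5:
        return "RISKY"
    return "SAFE"


# Scores copied from the "Common Risk Patterns" table.
PATTERN_SCORES = {
    "API endpoint removed": 10,
    "Required field added": 9,
    "Database host changed": 9,
    "Dependency removed": 9,
    "Major version bump": 7,
    "Port number changed": 6,
    "New optional field": 2,
    "Comment updated": 1,
}

levels = {pattern: risk_level(score) for pattern, score in PATTERN_SCORES.items()}
for pattern, level in levels.items():
    print(f"{level:<8} {pattern}")
```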

Exit Codes

Perfect for CI/CD pipelines:

Code  Meaning                 Action
0     All changes safe        Deploy freely
1     Risky changes found     Review recommended
2     Breaking changes found  Block deployment
3     Error occurred          Check logs

# In CI/CD pipeline
# Capture the exit code explicitly so the check also works under `set -e`
status=0
what-changed compare staging.env production.env --breaking-only || status=$?
if [ "$status" -eq 2 ]; then
  echo "Breaking changes detected! Blocking deployment."
  exit 1
fi
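
The same gate can be written in Python for pipelines that already run a Python step. This is a sketch under stated assumptions: the exit-code meanings come from the table above, while should_block, run_gate, and the choice to block on errors (code 3) and on unknown codes are decisions made for this example only.

```python
import subprocess

# Meanings taken from the exit code table above.
EXIT_MEANINGS = {
    0: "all changes safe",
    1: "risky changes found",
    2: "breaking changes found",
    3: "error occurred",
}


def should_block(returncode: int, block_on_risky: bool = False) -> bool:
    """Decide whether a deployment should stop for a given exit code."""
    if returncode == 0:
        return False
    if returncode == 1:
        return block_on_risky
    # 2 (breaking), 3 (error), or anything unexpected: fail closed.
    return True


def run_gate(source_a: str, source_b: str, block_on_risky: bool = False) -> int:
    """Run `what-changed compare` and return 1 to block the pipeline, else 0."""
    proc = subprocess.run(["what-changed", "compare", source_a, source_b])
    print(EXIT_MEANINGS.get(proc.returncode, "unexpected exit code"))
    return 1 if should_block(proc.returncode, block_on_risky) else 0
```

A pipeline step could then call sys.exit(run_gate("staging.env", "production.env")). Failing closed on errors is deliberate here: a comparison that could not run says nothing about safety.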

CLI Reference

what-changed compare [OPTIONS] SOURCE_A SOURCE_B

Arguments:
  SOURCE_A    First source (file, directory, or URL)
  SOURCE_B    Second source (file, directory, or URL)

Options:
  -s, --summary        Brief summary only
  -b, --breaking-only  Show only breaking changes
  -j, --json           Output as JSON
  -v, --verbose        Show rule matches and scores
  --no-color           Disable colors
  --graph              Interactive TUI graph view
  --stats              Visual statistics dashboard
  --help               Show help message

what-changed git [OPTIONS] REF_SPEC

Arguments:
  REF_SPEC    Git reference (e.g., HEAD~1..HEAD, main..feature)

Options:
  -p, --path PATH      Compare specific file only
  -b, --breaking-only  Show only breaking changes
  --help               Show help message

what-changed detect FILE

  Detect and display the file type.

Programmatic Usage

from what_changed import compare

# Simple comparison
result = compare("old.json", "new.json")

print(f"Breaking changes: {result.breaking_count}")
print(f"Risky changes: {result.risky_count}")
print(f"Safe changes: {result.safe_count}")

for change in result.breaking_changes:
    print(f"  - {change.path}: {change.summary}")
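
Code that consumes the result object can be exercised without running a real comparison. The sketch below relies only on the three count attributes shown above (breaking_count, risky_count, safe_count); summary_line and worst_level are hypothetical helpers, and the SimpleNamespace object merely stands in for a real result.

```python
from types import SimpleNamespace


def summary_line(result) -> str:
    """Format the counts like the header line of the default terminal output."""
    return (
        f"[X] {result.breaking_count} breaking  "
        f"[!] {result.risky_count} risky  "
        f"[o] {result.safe_count} safe"
    )


def worst_level(result) -> str:
    """Return the most severe level that has at least one change."""
    if result.breaking_count:
        return "BREAKING"
    if result.risky_count:
        return "RISKY"
    return "SAFE"


# Stand-in for the object returned by compare(); only the counts are used.
fake_result = SimpleNamespace(breaking_count=1, risky_count=2, safe_count=3)
print(summary_line(fake_result))
print(worst_level(fake_result))
```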

Advanced Usage

from what_changed.normalize import create_default_registry
from what_changed.diff import SemanticDiff
from what_changed.graph import GraphBuilder
from what_changed.rules import create_default_engine
from what_changed.explain import Explainer

# Full pipeline
registry = create_default_registry()
obj_a = registry.normalize("config.old.json")
obj_b = registry.normalize("config.new.json")

differ = SemanticDiff()
diff_result = differ.diff(obj_a, obj_b)

builder = GraphBuilder()
graph = builder.build(diff_result)

engine = create_default_engine()
rule_results = engine.apply(graph)

explainer = Explainer()
for exp in explainer.explain_graph(graph, rule_results):
    print(f"{exp.risk_level.name}: {exp.change_summary}")

Edge Case Handling

what-changed handles edge cases gracefully:

Scenario                 Behavior
Password-protected PDF   Clear error message
Malformed JSON/YAML      Parse error with line number
Empty files              Shows structure changes
Binary files             Falls back to size comparison
Missing files            Clear "not found" error
Identical files          "No changes detected"
100+ page PDFs           Paginated output (20 items max)
Unicode/Emoji content    Full UTF-8 support
Files without extension  Attempts content-based detection
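
Content-based detection for files without an extension usually starts from well-known file signatures and then falls back to trying a parser. The sketch below shows that general technique; it is not the project's detect module, and the sniff function and its return labels are assumptions for illustration.

```python
import json

# Well-known leading byte signatures for a few binary formats.
MAGIC_NUMBERS = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF8", "gif"),
    (b"PK\x03\x04", "zip"),
]


def sniff(data: bytes) -> str:
    """Guess a format label from file content alone."""
    for signature, label in MAGIC_NUMBERS:
        if data.startswith(signature):
            return label
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Not a known signature and not valid UTF-8: treat as opaque binary.
        return "binary"
    try:
        json.loads(text)
        return "json"
    except json.JSONDecodeError:
        return "text"


print(sniff(b"%PDF-1.7\n"))
print(sniff(b'{"database": {"host": "localhost"}}'))
print(sniff(b"plain notes"))
```

Checking signatures first keeps binary formats from being misread as text, and the "binary" fallback corresponds to the size-only comparison listed in the table.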

Architecture

what-changed/
├── cli/              # Command-line interface (Typer)
├── detect/           # File type detection
├── normalize/        # Format-specific parsers
│   ├── config.py     # JSON, YAML, TOML, INI, ENV
│   ├── pdf.py        # PDF with OCR support
│   ├── openapi.py    # OpenAPI/Swagger
│   ├── data.py       # CSV, Excel
│   ├── docker.py     # Dockerfile, Compose
│   ├── dependencies.py # package.json, requirements.txt
│   ├── image.py      # Images with OCR
│   ├── sql.py        # SQL schemas
│   └── archive.py    # ZIP, TAR
├── diff/             # Semantic diff engine
├── graph/            # Change graph construction
├── rules/            # Risk scoring heuristics
├── explain/          # Human-readable explanations
├── output/           # Renderers (terminal, JSON)
│   └── formats/      # Format-specific renderers
├── tui/              # Interactive terminal UI
└── git/              # Git integration

Philosophy

  • Clarity over cleverness — Readable, maintainable code
  • Extensible by design — Easy to add formats and rules
  • Deterministic results — No ML/AI, fully reproducible
  • Offline-first — No external API calls
  • Cross-platform — Windows, macOS, Linux

Contributing

# Clone and install
git clone https://github.com/yourusername/what-changed
cd what-changed
pip install -e ".[dev,all]"

# Run tests
pytest

# Type checking
mypy what_changed

# Linting
ruff check what_changed

License

MIT License — see LICENSE for details.


Built with care for developers who hate surprises in production.

Download files

Source Distribution

what_changed-0.2.1.tar.gz (184.4 kB)

Algorithm    Hash digest
SHA256       34f5e440df6ad690c9ffd6a5eb61a22f7459b98559a2e61aa0e5507cb195ec65
MD5          4fd9603d3d9718b4f63870a3c06d6f1a
BLAKE2b-256  c878df5ded0c97a9942f04293e790eda6b37889bc82f3160a8baa8b7874fa586

Built Distribution

what_changed-0.2.1-py3-none-any.whl (135.9 kB, Python 3)

Algorithm    Hash digest
SHA256       9a58018e0bcddae639b6b8fd9909395ae4d040c8899a0825ec818fbee68da66d
MD5          f2f72b547bfa6835484976aa70ea1797
BLAKE2b-256  f5440b30d9fd9842e2f970df628eb468749be91517a95f11ae2d130f5783e823

Provenance

Both files were uploaded via twine/6.1.0 CPython/3.13.7 using Trusted Publishing. Attestation bundles were made for both files by publisher publish.yml on aayushadhikari7/what-changed.