Shannon Insight

Multi-signal codebase quality analyzer using information-theoretic primitives. Named after Claude Shannon, father of information theory.

Quick Start

pip install shannon-insight
shannon-insight /path/to/codebase
shannon-insight . --format json | jq .

What It Does

Shannon Insight scans your codebase and computes 5 orthogonal quality primitives per file, then fuses them with consistency-weighted scoring to surface files that need attention:

Primitive            What it measures                  High means
Structural Entropy   AST node type distribution        Chaotic organization
Network Centrality   PageRank on dependency graph      Critical hub
Churn Volatility     File modification recency         Recently changed / unstable
Semantic Coherence   Import/export focus               Focused file (low means too many unrelated concerns)
Cognitive Load       Functions × complexity × nesting  Overloaded file
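
To make the first primitive concrete, here is a minimal sketch of Shannon entropy over an AST node type distribution, using Python's standard `ast` module. The function names `entropy` and `structural_entropy` are hypothetical, and the normalization Shannon Insight applies internally is not shown here and may differ.

```python
import ast
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a {label: count} distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def structural_entropy(source):
    """Entropy of AST node type frequencies for one Python source string.

    Illustrative only: the tool's own scanner and scaling are assumptions.
    """
    tree = ast.parse(source)
    counts = Counter(type(node).__name__ for node in ast.walk(tree))
    return entropy(counts)


if __name__ == "__main__":
    print(structural_entropy("def f(x):\n    return x + 1\n"))
```

A file dominated by one kind of node scores near zero, while a file that mixes many node types evenly scores higher.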

Output Formats

# Rich terminal output (default) with summary dashboard
shannon-insight .

# Machine-readable JSON
shannon-insight . --format json

# Pipe-friendly CSV
shannon-insight . --format csv

# Just file paths (one per line)
shannon-insight . --format quiet

# Deep-dive on a specific file
shannon-insight . --explain complex.go

# Export JSON report to file
shannon-insight . --output report.json

CI Integration

Use --fail-above to gate CI pipelines on code quality:

# Fail if any file scores above 2.0
shannon-insight . --format quiet --fail-above 2.0

Example GitHub Actions step:

- name: Code quality gate
  run: shannon-insight . --fail-above 2.0 --format quiet

Configuration

Create shannon-insight.toml in your project root:

z_score_threshold = 1.5
fusion_weights = [0.2, 0.25, 0.2, 0.15, 0.2]
exclude_patterns = ["*_test.go", "vendor/*", "node_modules/*"]
max_file_size_mb = 10.0
enable_cache = true

Or use environment variables with SHANNON_ prefix:

export SHANNON_Z_SCORE_THRESHOLD=2.0
export SHANNON_ENABLE_CACHE=false
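
The mapping from environment variables to configuration keys can be sketched as follows. This is an illustration under assumptions, not the tool's actual loader: the helper names `env_overrides` and `_coerce` are hypothetical, and the type coercion rules (booleans, then floats, then plain strings) are a guess at reasonable behavior.

```python
import os

PREFIX = "SHANNON_"


def _coerce(raw):
    """Turn a raw string into bool or float when possible (assumed rules)."""
    lowered = raw.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    try:
        return float(raw)
    except ValueError:
        return raw


def env_overrides(environ=None):
    """Collect SHANNON_* variables as lower-case configuration keys."""
    environ = os.environ if environ is None else environ
    overrides = {}
    for name, raw in environ.items():
        if name.startswith(PREFIX):
            overrides[name[len(PREFIX):].lower()] = _coerce(raw)
    return overrides


if __name__ == "__main__":
    print(env_overrides({"SHANNON_Z_SCORE_THRESHOLD": "2.0",
                         "SHANNON_ENABLE_CACHE": "false"}))
```

Under this sketch, SHANNON_Z_SCORE_THRESHOLD corresponds to the z_score_threshold key from the TOML file.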

CLI Options

Options:
  PATH                      Path to codebase directory [default: .]
  -l, --language TEXT       Language (auto, python, go, typescript, react, javascript)
  -t, --top INTEGER         Number of top files to display [1-1000]
  -o, --output FILE         Export JSON report to file
  -f, --format TEXT         Output format: rich, json, csv, quiet
  -e, --explain TEXT        Deep-dive on matching file(s)
  --fail-above FLOAT        CI gate: exit 1 if max score exceeds threshold
  --threshold FLOAT         Z-score threshold for anomaly detection
  -c, --config FILE         TOML configuration file
  -v, --verbose             Enable DEBUG logging
  -q, --quiet               Suppress all but ERROR logging
  --no-cache                Disable caching
  --clear-cache             Clear cache before running
  -w, --workers INTEGER     Parallel workers [1-32]
  --version                 Show version and exit

Commands:
  cache-info    Show cache statistics
  cache-clear   Clear analysis cache

Supported Languages

  • Python - .py files
  • Go - .go files
  • TypeScript/React - .ts, .tsx files
  • JavaScript - .js, .jsx files (uses TypeScript scanner)

Language is auto-detected by default. Override with --language.

How It Works

CodebaseAnalyzer
  Layer 1: Scanning       - Language-specific file parsing
  Layer 2: Extraction     - Compute 5 orthogonal primitives per file
  Layer 3: Detection      - Z-score normalization + anomaly thresholding
  Layer 4: Fusion         - Consistency-weighted signal combination
  Layer 5: Recommendations - Root cause attribution + actionable advice
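
Layer 3 can be illustrated with a short sketch of z-score normalization followed by thresholding. The function names and the use of the population standard deviation are assumptions; the default threshold of 1.5 mirrors the z_score_threshold value in the configuration example.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Normalize raw primitive values to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All files look identical on this primitive: nothing stands out.
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def anomalies(values_by_file, threshold=1.5):
    """Return {file: z} for files whose |z| exceeds the threshold."""
    names = list(values_by_file)
    zs = z_scores([values_by_file[name] for name in names])
    return {name: z for name, z in zip(names, zs) if abs(z) > threshold}


if __name__ == "__main__":
    raw = {"a.py": 1.0, "b.py": 1.0, "c.py": 1.0, "d.py": 1.0, "e.py": 10.0}
    print(anomalies(raw))
```

In the usage example only e.py is flagged, because its value sits two standard deviations above the mean while the others sit half a standard deviation below it.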

Signal fusion uses coefficient of variation to penalize inconsistent signals:

consistency = 1 / (1 + CV)
final_score = consistency * |weighted_average|
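
The two formulas above can be sketched in Python as follows. The function name `fuse` is hypothetical, and this sketch assumes CV is the population standard deviation divided by the mean of the absolute z-scores; docs/MATHEMATICAL_FOUNDATION.md is the authority on the exact definition.

```python
from statistics import mean, pstdev


def fuse(z_scores, weights):
    """Consistency-weighted fusion of per-primitive z-scores for one file."""
    weighted_average = sum(w * z for w, z in zip(weights, z_scores)) / sum(weights)

    # Assumption: CV is taken over signal magnitudes so it stays non-negative.
    magnitudes = [abs(z) for z in z_scores]
    mu = mean(magnitudes)
    cv = pstdev(magnitudes) / mu if mu > 0 else 0.0

    consistency = 1 / (1 + cv)
    return consistency * abs(weighted_average)


if __name__ == "__main__":
    equal = [1, 1, 1, 1, 1]
    print(fuse([0.8, 0.8, 0.8, 0.8, 0.8], equal))  # signals agree
    print(fuse([4.0, 0.0, 0.0, 0.0, 0.0], equal))  # one signal dominates
```

Both inputs in the usage example have the same weighted average, but the second is penalized because its signals disagree, so it receives the lower final score.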

See docs/MATHEMATICAL_FOUNDATION.md for the full mathematical framework.

Development

git clone https://github.com/namanag97/shannon-insight.git
cd shannon-insight
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

make test          # Run tests with coverage
make lint          # Run ruff linter
make format        # Format with ruff
make type-check    # Run mypy
make all           # Format + lint + type-check + test

Contributing

See CONTRIBUTING.md for guidelines.

License

MIT License - see LICENSE

Credits

Created by Naman Agarwal. Inspired by Claude Shannon's information theory, PageRank (Page & Brin), and cyclomatic complexity (McCabe).

