Comprehensive codebase analysis library with coverage, metrics, and test duration analysis
Codebase Stats
Production-ready code quality metrics analysis library for Python projects. Comprehensive analysis of coverage, complexity, maintainability, and code structure with automated reporting and quality gates.
Quick Start
Installation
# Using uv (recommended)
uv pip install codebase-stats
# Or with pip
pip install codebase-stats
Basic Usage
from codebase_stats import CodebaseStatsReporter
# Generate comprehensive metrics report
reporter = CodebaseStatsReporter(
    coverage_file='coverage.json',
    report_file='report.json',
    radon_root='src',
    fs_root='src',
    tree_root='src',
)
# Save report to file
reporter.save_report('metrics_report.txt', include_coverage=True, include_complexity=True)
Command Line Interface
# Full analysis with all metrics
python cli.py coverage.json --radon-root src --fs-root src
# Show specific sections
python cli.py coverage.json --show coverage complexity mi
# List low-coverage files
python cli.py coverage.json --show list --threshold 80 --top 20
# Help
python cli.py --help
Features
📊 Coverage Analysis
- Statement & Branch Coverage: Full coverage metrics from pytest-cov
- Coverage Distribution: Histogram visualization with percentiles (Q1, Q2, Q3, p90, p95, p99)
- Low-Coverage Detection: Identify files below thresholds with automatic prioritization
- Pragma Tracking: Track `# pragma: no cover` usage for documentation
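As a rough illustration of low-coverage detection, the sketch below ranks files from a parsed coverage report by their coverage percentage. It is not the library's implementation: `low_coverage_files` and its parameters are hypothetical names, and the `files` → `summary` → `percent_covered` layout is assumed from coverage.py's JSON report format.

```python
def low_coverage_files(report, threshold=80.0, top=20):
    """Return (path, percent) pairs below `threshold`, worst first.

    Hypothetical helper: `report` is assumed to be the parsed contents of a
    coverage.py JSON report (for example, the result of json.load on
    coverage.json).
    """
    rows = [
        (path, entry["summary"]["percent_covered"])
        for path, entry in report.get("files", {}).items()
    ]
    below = [row for row in rows if row[1] < threshold]
    # Lowest coverage first, so the most urgent files lead the list.
    below.sort(key=lambda row: row[1])
    return below[:top]


# Made-up sample data in the assumed report layout.
sample_report = {
    "files": {
        "src/a.py": {"summary": {"percent_covered": 100.0}},
        "src/b.py": {"summary": {"percent_covered": 40.0}},
        "src/c.py": {"summary": {"percent_covered": 75.5}},
    }
}

for path, pct in low_coverage_files(sample_report, threshold=80.0):
    print(f"{pct:6.1f}%  {path}")
```

Sorting ascending puts the files that need the most attention at the top, which mirrors the `--threshold` and `--top` options shown in the CLI examples.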
🔍 Code Complexity
- Cyclomatic Complexity (CC): Radon integration for function complexity analysis
- Grade A: 1-5 (ideal)
- Grade B: 6-10 (acceptable)
- Grade C+: 11+ (refactor recommended)
- Maintainability Index (MI): Code readability/maintainability scoring (0-100)
- Halstead Metrics: Bug estimation and code volume analysis
- Comment Ratios: Documentation density analysis
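The grade bands listed above can be written as a small lookup. This is only a sketch of the A / B / C+ bands as this section states them; `cc_grade` is a hypothetical name and not part of the documented API.

```python
def cc_grade(cc):
    """Map a cyclomatic complexity score to the bands listed above.

    Hypothetical helper: A covers 1-5, B covers 6-10, and everything
    from 11 upward is reported as "C+" (refactor recommended).
    """
    if cc < 1:
        raise ValueError("cyclomatic complexity starts at 1")
    if cc <= 5:
        return "A"
    if cc <= 10:
        return "B"
    return "C+"


for score in (1, 5, 6, 10, 11, 42):
    print(score, cc_grade(score))
```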
📈 Code Metrics
- File Size Distribution: Line count per file with outlier detection
- Directory Structure Analysis: Module organization and hierarchy
- Test Duration Distribution: Identify slow tests
- Quality Gates: Automated threshold validation
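To make the slow-test idea concrete, here is a minimal sketch that ranks tests by total duration. It assumes `report.json` follows the pytest-json-report layout (a `tests` list whose entries carry `nodeid` plus optional `setup`, `call`, and `teardown` phases with a `duration` each); the README does not say which plugin produces the file, so treat that layout and the `slowest_tests` name as assumptions.

```python
def slowest_tests(report, top=5):
    """Return (nodeid, seconds) pairs for the slowest tests, slowest first.

    Hypothetical helper working on an already-parsed report dictionary in
    the assumed pytest-json-report layout.
    """
    durations = []
    for test in report.get("tests", []):
        # A phase may be missing (for example when setup fails), so default to 0.
        total = sum(
            test.get(phase, {}).get("duration", 0.0)
            for phase in ("setup", "call", "teardown")
        )
        durations.append((test["nodeid"], total))
    durations.sort(key=lambda item: item[1], reverse=True)
    return durations[:top]


# Made-up sample data in the assumed layout.
sample_report = {
    "tests": [
        {"nodeid": "tests/test_a.py::test_fast", "call": {"duration": 0.5}},
        {
            "nodeid": "tests/test_a.py::test_slow",
            "setup": {"duration": 0.25},
            "call": {"duration": 1.5},
            "teardown": {"duration": 0.25},
        },
    ]
}

for nodeid, seconds in slowest_tests(sample_report):
    print(f"{seconds:6.2f}s  {nodeid}")
```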
📋 Reporting
- Histogram Visualization: ASCII histograms with customizable bins and scaling
- Blame Sections: Highlight problematic files (Q3 + 1.5×IQR threshold)
- Percentile Analysis: Q1, median, Q3, p90, p95, p99
- Structured Output: Markdown, text, or programmatic JSON
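The blame threshold mentioned above (Q3 + 1.5×IQR) is the usual upper fence for outliers. The sketch below computes it with the standard library; `blame_threshold` and `blamed` are hypothetical names, and the choice of the inclusive quantile method is an assumption, since the README does not say how quartiles are interpolated.

```python
import statistics


def blame_threshold(values):
    """Upper outlier fence: Q3 + 1.5 * IQR (needs at least two values)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


def blamed(metric_by_path):
    """Return the paths whose metric exceeds the fence, sorted by name."""
    threshold = blame_threshold(list(metric_by_path.values()))
    return sorted(path for path, value in metric_by_path.items() if value > threshold)


# Made-up file sizes in lines; one file is far larger than the rest.
sizes = {
    "a.py": 10, "b.py": 20, "c.py": 30, "d.py": 40, "e.py": 50,
    "f.py": 60, "g.py": 70, "h.py": 80, "big.py": 500,
}

print(blame_threshold(list(sizes.values())))
print(blamed(sizes))
```

With this sample, Q1 is 30 and Q3 is 70, so the fence sits at 130 and only `big.py` is flagged.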
Documentation
- Architecture Documentation - System design, data flows, and component details
- Development Roadmap - Release planning and feature roadmap
- Architecture Decision Records - Design decisions and rationales
- Governance & Quality Gates - Development workflow and CI/CD policies
API Reference
Core Classes
CodebaseStatsReporter
Main interface for generating comprehensive metrics reports.
from codebase_stats import CodebaseStatsReporter
reporter = CodebaseStatsReporter(
    coverage_file='coverage.json',  # str: path to coverage.json
    report_file=None,               # str, optional: path to pytest report.json
    radon_root=None,                # str, optional: root for CC/MI/comment analysis
    fs_root=None,                   # str, optional: root for file size analysis
    tree_root=None,                 # str, optional: root for structure analysis
)
# Methods
reporter.save_report('metrics_report.txt', include_coverage=True, include_complexity=True)
stats = reporter.get_stats()  # Returns the raw metrics dictionary (dict)
report_text = str(reporter)   # Generate text report
Data Structure
Coverage Stats Dictionary
stats = {
    "coverages_sorted": [float],     # All file coverage %
    "proj_pct": float,               # Project-wide coverage %
    "proj_total": int,               # Total lines
    "proj_covered": int,             # Covered lines
    "file_stats": [
        {
            "pct": float,            # File coverage %
            "path": str,             # File path
            "missing_count": int,    # Missing line count
            "missing_lines": [int],  # Missing line numbers
            "cc_avg": float,         # Avg cyclomatic complexity
            "mi": float,             # Maintainability index
            "comment_ratio": float,  # Comment/SLOC ratio
            "hal_bugs": float,       # Halstead bug estimate
            "size_lines": int,       # File line count
            ...
        }
    ]
}
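As an example of consuming a dictionary shaped like the one above, the following sketch lists the least-covered files together with their missing lines. The `worst_files` helper and the sample values are made up for illustration; only the key names come from the structure shown.

```python
def worst_files(stats, n=3):
    """Return (path, pct, missing_lines) for the n least-covered files."""
    ranked = sorted(stats["file_stats"], key=lambda entry: entry["pct"])
    return [(entry["path"], entry["pct"], entry["missing_lines"]) for entry in ranked[:n]]


# Made-up sample using a subset of the documented keys.
sample_stats = {
    "proj_pct": 75.0,
    "file_stats": [
        {"path": "src/a.py", "pct": 100.0, "missing_count": 0, "missing_lines": []},
        {"path": "src/b.py", "pct": 50.0, "missing_count": 2, "missing_lines": [4, 9]},
        {"path": "src/c.py", "pct": 75.0, "missing_count": 1, "missing_lines": [12]},
    ],
}

for path, pct, missing in worst_files(sample_stats, n=2):
    print(f"{path}: {pct:.1f}% covered, missing lines {missing}")
```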
Module APIs
coverage.py - Coverage Analysis
from codebase_stats.coverage import load_coverage, precompute_coverage_stats
stats = load_coverage('coverage.json')
stats = precompute_coverage_stats(stats, radon_root='src')
metrics.py - Complexity Metrics
from codebase_stats.metrics import get_cyclomatic_complexity, get_maintainability, get_comments_ratio
cc = get_cyclomatic_complexity('file.py')
mi = get_maintainability('file.py')
ratio = get_comments_ratio('file.py')
radon.py - Radon Integration
from codebase_stats.radon import get_cc_list, get_mi_list, get_metrics
cc_data = get_cc_list('src')
mi_data = get_mi_list('src')
hal_data = get_metrics('src')
reporter.py - Report Generation
from codebase_stats.reporter import CodebaseStatsReporter
reporter = CodebaseStatsReporter(...)
reporter.save_report('output.txt') # Save formatted report
utils.py - Utilities
from codebase_stats.utils import ascii_histogram, percentile, format_value
hist_str = ascii_histogram(data, bins=10, width=80)
p95 = percentile(data, 0.95)
formatted = format_value(value, decimals=2)
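For readers curious how an ASCII histogram can be built, here is a self-contained sketch using equal-width bins. It is not the library's `ascii_histogram`; the `bin_counts` and `text_histogram` names, their parameters, and the output layout are all assumptions made for illustration.

```python
def bin_counts(values, bins=10):
    """Count how many values fall into each of `bins` equal-width buckets."""
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    counts = [0] * bins
    for value in values:
        # The maximum value would land in bucket `bins`, so clamp it.
        index = min(int((value - lo) * bins / span), bins - 1)
        counts[index] += 1
    return counts


def text_histogram(values, bins=10, width=40):
    """Render one line per bucket, scaling the longest bar to `width`."""
    counts = bin_counts(values, bins)
    lo, hi = min(values), max(values)
    step = ((hi - lo) or 1) / bins
    peak = max(counts)
    lines = []
    for i, count in enumerate(counts):
        bar = "#" * round(count / peak * width)
        lines.append(f"{lo + i * step:8.1f} | {bar} {count}")
    return lines


coverage = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print("\n".join(text_histogram(coverage, bins=5, width=20)))
```

Scaling bars against the fullest bucket keeps the output within a fixed width regardless of how many values are plotted.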
Quality Gates
All code must meet these thresholds:
| Metric | Threshold | Rationale |
|---|---|---|
| Coverage | 100% | All source code must be tested |
| Cyclomatic Complexity | ≤10 average | Keeps functions at grade B or better |
| Maintainability Index | ≥50 | Well above the grade A floor (MI ≥ 20) |
| File Size | ≤400 lines | Modules remain manageable |
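A gate check over these thresholds could look like the sketch below. The limits are copied from the table; the metric key names and the `failed_gates` helper are hypothetical and do not describe how the library itself enforces gates.

```python
import operator

# Thresholds copied from the table above; the key names are hypothetical.
GATES = {
    "coverage_pct": (operator.ge, 100.0),
    "cc_avg": (operator.le, 10.0),
    "mi": (operator.ge, 50.0),
    "max_file_lines": (operator.le, 400),
}


def failed_gates(metrics):
    """Return the names of the gates that `metrics` does not satisfy."""
    failures = []
    for name, (passes, limit) in GATES.items():
        if not passes(metrics[name], limit):
            failures.append(name)
    return failures


healthy = {"coverage_pct": 100.0, "cc_avg": 4.2, "mi": 72.5, "max_file_lines": 380}
failing = {"coverage_pct": 97.3, "cc_avg": 4.2, "mi": 72.5, "max_file_lines": 512}

print(failed_gates(healthy))
print(failed_gates(failing))
```

In a CI job, a non-empty result would typically translate into a non-zero exit code.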
Development Workflow
Setup
# Clone repository
git clone https://github.com/brunolnetto/codebase-stats.git
cd codebase-stats
# Install with dev dependencies
uv venv
uv pip install -e ".[dev]"
# Activate environment
source .venv/bin/activate
Testing
# Run all tests
pytest
# With coverage
pytest --cov=codebase_stats --cov-report=term-missing
# Specific test file
pytest tests/test_coverage.py -v
Code Quality
# Linting
ruff check codebase_stats/ tests/
# Format check
ruff format --check codebase_stats/ tests/
# Type checking
mypy codebase_stats/
# All quality checks
make quality
Commits & PRs
This project uses the GitFlow workflow with Conventional Commits:
# Feature branches
git checkout -b feat/feature-name
# Fix branches
git checkout -b fix/issue-name
# Chore/documentation
git checkout -b chore/update-name
# Commit format: <type>(<scope>): <description>
git commit -m "feat(coverage): add pragma tracking"
git commit -m "fix(radon): handle empty files gracefully"
git commit -m "docs(readme): add API reference"
See Development Policy for full workflow details.
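The `<type>(<scope>): <description>` format shown above is easy to check mechanically, for instance in a commit hook. The regular expression below is a minimal sketch under the assumption that type and scope are lowercase and the scope is always present, as in the examples; the project's actual policy may be looser or stricter, and `parse_commit` is a hypothetical name.

```python
import re

# Assumed shape: lowercase type, parenthesised scope, colon, space, description.
COMMIT_RE = re.compile(
    r"^(?P<type>[a-z]+)\((?P<scope>[a-z0-9_-]+)\): (?P<description>\S.*)$"
)


def parse_commit(message):
    """Return the parts of a commit subject line, or None if it does not match."""
    match = COMMIT_RE.match(message)
    return match.groupdict() if match else None


for subject in (
    "feat(coverage): add pragma tracking",
    "fix(radon): handle empty files gracefully",
    "added some stuff",
):
    print(subject, "->", parse_commit(subject))
```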
Architecture Highlights
Data Flow:
Raw Input (coverage.json, report.json)
↓
load_coverage() → Enrich with Radon metrics
↓
precompute_coverage_stats() → Compute distributions
↓
Display Functions → Histograms, tables, blame sections
↓
Reporter → Formatted text/markdown output
Key Design Patterns:
- Single Responsibility: Each module handles one analysis type
- Composition: Reporter combines multiple analysis modules
- Lazy Evaluation: Radon metrics computed on-demand
- Immutable Data: Stats dicts treated as read-only
- Histogram Abstraction: Consistent visualization across metrics
See Architecture Documentation for detailed system design.
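To illustrate the lazy-evaluation pattern listed above, the sketch below defers an expensive metric until it is first read and then reuses the result. It shows the general technique with `functools.cached_property`; the `FileReport` class and the stand-in complexity function are hypothetical and say nothing about how the library is actually structured.

```python
from functools import cached_property


class FileReport:
    """Hypothetical per-file record whose complexity is computed on demand."""

    def __init__(self, path, compute_complexity):
        self.path = path
        self._compute_complexity = compute_complexity

    @cached_property
    def cc_avg(self):
        # Runs only on first access; the result is then stored on the instance.
        return self._compute_complexity(self.path)


calls = []


def fake_complexity(path):
    """Stand-in for an expensive analysis; records each invocation."""
    calls.append(path)
    return 3.5


report = FileReport("src/module.py", fake_complexity)
print(calls)          # nothing computed yet
print(report.cc_avg)  # triggers the computation
print(report.cc_avg)  # served from the cached value
print(calls)
```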
Examples
Generate Full Metrics Report
# After running tests
pytest --cov=src --cov-report=json
# Generate and save report
python cli.py coverage.json \
--radon-root src \
--fs-root src \
--tree-root src \
--report metrics_report.txt
Analyze Coverage Gaps
python cli.py coverage.json --show coverage gaps
Monitor Complexity Trends
from codebase_stats import CodebaseStatsReporter
def check_complexity_trend():
    reporter = CodebaseStatsReporter('coverage.json', radon_root='src')
    stats = reporter.get_stats()
    cc_values = [f['cc_avg'] for f in stats['file_stats']]
    if not cc_values:
        print("No complexity data found")
        return
    avg_cc = sum(cc_values) / len(cc_values)
    # 10 is the quality-gate ceiling for average cyclomatic complexity
    if avg_cc > 10:
        print(f"⚠️ Complexity above threshold: {avg_cc:.2f}")
    else:
        print(f"✅ Complexity healthy: {avg_cc:.2f}")
Contributing
- See Development Roadmap for planned work
- Submit PRs against the `develop` branch (GitFlow)
- All PRs require passing quality gates and 100% test coverage
- Follow Governance for workflow details
License
MIT License - see LICENSE file for details
Metrics Definitions
| Metric | Range | Interpretation |
|---|---|---|
| Coverage | 0-100% | Percentage of code lines executed by tests |
| CC | 1-50+ | Function branching complexity; A≤5, B≤10, C≤15, D≤20, E/F>20 |
| MI | 0-100 | Code readability/maintainability; A≥20, B≥10, C≥0 |
| Comment Ratio | 0-100% | Percentage of code that is comments/docstrings |
| Halstead Bugs | 0-N | Estimated number of bugs; lower is better |
| File Size | Lines | Module size; target ≤400 for maintainability |
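The MI bands in the table (A≥20, B≥10, C≥0) translate directly into a lookup. This is a sketch of those bands only; `mi_grade` is a hypothetical helper name.

```python
def mi_grade(mi):
    """Map a maintainability index (0-100) to the bands in the table above."""
    if not 0 <= mi <= 100:
        raise ValueError("maintainability index must be between 0 and 100")
    if mi >= 20:
        return "A"
    if mi >= 10:
        return "B"
    return "C"


for score in (100, 50, 20, 19.9, 10, 0):
    print(score, mi_grade(score))
```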
Status: Production-ready · Latest Release: 1.0.0 · Python: 3.12+
File details
Details for the file codebase_stats-0.0.1.tar.gz.
File metadata
- Download URL: codebase_stats-0.0.1.tar.gz
- Upload date:
- Size: 51.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1b3040aa4e9d34b7aaec2c6d27aaa9afefdb26e6e84233a591989c2f3fcae33a |
| MD5 | 3c0881c60ae8ca8adbabbfb4755b5d33 |
| BLAKE2b-256 | 2e596f8967987f719e0fc488b6bd77e651ee08f0fa960776bffeb90c8c1a60d4 |
File details
Details for the file codebase_stats-0.0.1-py3-none-any.whl.
File metadata
- Download URL: codebase_stats-0.0.1-py3-none-any.whl
- Upload date:
- Size: 32.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3562d940d80185ed97faa904d4562d3006c15c5d13e22af39fbf9747e75f6e73 |
| MD5 | 6fdce8e98902d3891cb4347f5ce1db75 |
| BLAKE2b-256 | 08b2e0e7a88f0780daf44d426c0d3eaf1bfcb5a28381b5fc3137d3220a68ef4b |