A tool for visualizing and analyzing code repositories

Code Cartographer

Automated Deep Multilayer Code Analysis and Optimization for Large Scale Codebases


As a masochist, I am never satisfied with a project until I reach perfection.
To me, perfection goes far beyond running correctly or reporting zero problems across every file in a directory.
Admittedly, I have never reached that elusive "perfect" finish line in any project, so I can't be sure of the definition. It's also just what happens when you're experimenting: deep in research, you iterate in a vacuum, and by the time something works, you've rewritten it five times. Unsurprisingly, I always have at least a few iterations of the same project on my local machine, all of which I'm actively refining.
This can cause confusion (shocker!), especially because I am reluctant to push or publish incomplete or "inadequate" code. I also have a fear of being perceived, and what's more vulnerable than my code?

"For when your code is too chaotic for flake8 and too personal for git push."

Just like that, a vicious cycle is born.
An unproductive, deeply confusing, memory-consuming vicious cycle.

Unfortunately, the cycle is much harder to follow when there are dozens of moving parts in dozens of subfolders, each with dozens of lengthy scripts. You could feed your directory layout to an LLM as context in an attempt to gain some clarity, but that's a gamble that backfires 9 times out of 10. Has any LLM ever, in all of human history, actually internalized a tree structure well enough to help you reorganize a repo? Yeah, I didn't think so. Not for me either. So once the vicious cycle had become my daily routine, I decided to give myself the illusion of respite, however brief. This has given me solace at least once. Enjoy!


Code Cartographer

"If Git is for branches, this is for forks of forks."


Features

  • Full file and definition level metadata
    Class/function blocks, line counts, docstrings, decorators, async flags, calls, type hints

  • Intelligent Code Normalization
    Standardizes variable names, function signatures, and code structure for better comparison

  • Advanced Variant Detection & Merging
    Automatically identifies and merges similar code blocks with semantic analysis

  • Auto-patching System
    Safely applies merged variants with automatic backup creation

  • Function/class SHA-256 hashes
    Detects variants, clones, and partial rewrites across versions

  • Cyclomatic complexity & maintainability index analysis (via radon)
    Flags "at-risk" code with CC > 10 or MI < 65

  • Auto-generated LLM refactor prompts
    Variant grouping, inline diffs, rewrite guidance

  • Internal dependency graph
    Outputs a Graphviz .dot of all intra-project imports

  • Markdown summary
    Skimmable digest with risk flags and structure

  • Interactive Dashboard
    Visual analysis of code complexity, variants, and dependencies

  • CLI flexibility
    Exclusion patterns, Git SHA tagging, output formatting, variant merging controls
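
For a rough idea of how definition-level SHA-256 hashes can surface clones, here is a minimal sketch using only the standard library. It is not Code Cartographer's implementation: the helper names `definition_hashes` and `find_clones` are hypothetical, and hashing the `ast.dump` of each body is just one assumed way to normalize code before hashing.

```python
import ast
import hashlib
from collections import defaultdict


def definition_hashes(source: str) -> dict[str, str]:
    """Map each function/class name to a SHA-256 digest of its body's AST dump."""
    hashes = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # ast.dump leaves out line/column attributes by default, so comments
            # and whitespace do not change the digest; only structure does.
            # Hashing the body alone (an assumption of this sketch) lets
            # identically written definitions with different names collide.
            body_dump = "".join(ast.dump(stmt) for stmt in node.body)
            hashes[node.name] = hashlib.sha256(body_dump.encode("utf-8")).hexdigest()
    return hashes


def find_clones(sources: dict[str, str]) -> list[list[str]]:
    """Group 'path:name' labels whose definition bodies hash identically."""
    groups = defaultdict(list)
    for path, source in sources.items():
        for name, digest in definition_hashes(source).items():
            groups[digest].append(f"{path}:{name}")
    return [sorted(members) for members in groups.values() if len(members) > 1]
```

Exact hashes only catch structurally identical bodies. Partial rewrites would need a fuzzier comparison on top of this.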

Temporal Topography - NEW!

Immersive Temporal Code Visualization Platform

Code Cartographer now includes Temporal Topography, a powerful web-based interface for exploring your codebase's evolution through time:

  • Temporal Analysis: Navigate git history and track code evolution
  • Interactive Visualizations: Complexity trends, dependency graphs, file hotspots
  • Refactoring Detection: Automatically identify renames, splits, merges, and extractions
  • Modern Web UI: Fast, responsive interface built with FastAPI and vanilla JavaScript
  • Real-time Updates: WebSocket-based progress tracking during analysis
  • Complexity Evolution: See how code complexity changes over time

Quick Start with Temporal Topography

# Start the web server
python -m code_cartographer serve

# Open browser to http://localhost:8000
# Click "Analyze Project" and enter your project path
# Explore your codebase through interactive visualizations

See Temporal Topography Documentation for the complete guide and API reference.


Setup & Installation

Prerequisites

  • Python 3.10+
  • (Optional) Graphviz for dependency visualization

Installation

You can install code-cartographer directly from PyPI:

pip install code-cartographer

For development installation:

  1. Clone the Repository
git clone https://github.com/stochastic-sisyphus/code-cartographer.git
cd code-cartographer
  2. Install Dependencies

Recommended: Use mise for automatic environment setup

mise trust
mise install
mise run install

Or manually:

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e ".[dev]"

Usage Guide

Quick Start

The code-cartographer package provides tools for analyzing Python codebases:

from code_cartographer import ProjectAnalyzer, VariantAnalyzer

# Initialize analyzers
project_analyzer = ProjectAnalyzer("/path/to/your/project")
variant_analyzer = VariantAnalyzer()

# Run analysis
analysis_results = project_analyzer.analyze()
variant_results = variant_analyzer.analyze(analysis_results)

# Generate reports
project_analyzer.generate_markdown("analysis.md")
project_analyzer.generate_dependency_graph("dependencies.dot")

Advanced Usage

1. Deep Code Analysis

from code_cartographer import ProjectAnalyzer

analyzer = ProjectAnalyzer(
    project_dir="/path/to/project",
    exclude_patterns=["tests/.*", "build/.*"]
)

# Run analysis
results = analyzer.analyze()

# Generate reports
analyzer.generate_markdown("summary.md")
analyzer.generate_dependency_graph("deps.dot")
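
To make the dependency-graph output less of a black box, here is a hedged sketch of how an intra-project import graph could be collected and written as Graphviz DOT. The function names `internal_imports` and `to_dot` are hypothetical and not part of the package. The sketch also assumes modules are named by their path relative to the project root and skips relative imports.

```python
import ast
from pathlib import Path


def internal_imports(project_dir: str) -> dict[str, set[str]]:
    """Return {module: set of project modules it imports} for every .py file."""
    root = Path(project_dir)
    # Dotted module name -> file path, relative to the project root.
    modules = {
        ".".join(path.relative_to(root).with_suffix("").parts): path
        for path in root.rglob("*.py")
    }
    edges = {name: set() for name in modules}
    for name, path in modules.items():
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                targets = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                targets = [node.module]
            else:
                continue
            # Keep only imports that resolve to another module in the project.
            edges[name].update(t for t in targets if t in modules and t != name)
    return edges


def to_dot(edges: dict[str, set[str]]) -> str:
    """Serialize the edge map as a Graphviz digraph."""
    lines = ["digraph dependencies {"]
    for source in sorted(edges):
        lines.append(f'    "{source}";')
        for target in sorted(edges[source]):
            lines.append(f'    "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)
```

Writing `to_dot(internal_imports("src"))` to a file gives you something `dot` can render, with third-party and standard-library imports filtered out.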

2. Code Variant Analysis

from code_cartographer import VariantAnalyzer

# Initialize analyzer with custom settings
analyzer = VariantAnalyzer(
    root="/path/to/project",
    semantic_threshold=0.8,  # 80% similarity required
    min_lines=5  # Minimum lines for variant consideration
)

# Run analysis
results = analyzer.analyze()

# Apply merged variants (with automatic backups)
analyzer.apply_merged_variants(backup=True)
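
If you are wondering what a similarity threshold and a minimum line count actually do, the sketch below shows the general idea. It swaps in plain textual similarity from `difflib` as a stand-in, which is not the semantic analysis the tool performs, and the helpers `function_sources` and `similar_pairs` are hypothetical names for illustration only.

```python
import ast
import difflib
from itertools import combinations


def function_sources(source: str, min_lines: int = 5) -> dict[str, str]:
    """Collect the source text of functions with at least min_lines lines."""
    found = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            segment = ast.get_source_segment(source, node)
            if segment and len(segment.splitlines()) >= min_lines:
                found[node.name] = segment
    return found


def similar_pairs(functions: dict[str, str], threshold: float = 0.8):
    """Return (name_a, name_b, ratio) for pairs at or above the threshold."""
    pairs = []
    for (name_a, src_a), (name_b, src_b) in combinations(sorted(functions.items()), 2):
        # Character-level similarity in [0, 1]; a stand-in for semantic scoring.
        ratio = difflib.SequenceMatcher(None, src_a, src_b).ratio()
        if ratio >= threshold:
            pairs.append((name_a, name_b, round(ratio, 2)))
    return pairs
```

A higher threshold reports fewer, closer matches. A higher `min_lines` keeps trivial helpers from being flagged as variants of each other.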

3. CLI Variant Management

# Analyze and merge variants with backups
code-cartographer variants -d /path/to/project --apply-merges

# Analyze and merge variants without backups
code-cartographer variants -d /path/to/project --apply-merges --no-backup

# Analyze with custom similarity threshold
code-cartographer variants -d /path/to/project --semantic-threshold 0.9

Output Structure

After analysis, you'll find:

analyzed-project/
├── analysis.md         # Human-readable summary
├── dependencies.dot    # Dependency graph (if Graphviz is installed)
└── variants.md        # Code variant analysis report

Key Metrics

  • Code complexity metrics
  • Import dependencies
  • Function/class definitions
  • Documentation coverage
  • Code variants and duplicates
  • Semantic similarity scores
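
As a rough illustration of the complexity metric in the list above, the sketch below counts decision points per function with the standard library. The tool itself relies on radon, so treat this as a simplified approximation whose numbers may differ. `approximate_complexity` and `at_risk` are hypothetical helper names, and branches in nested functions are counted toward the enclosing function too.

```python
import ast

# Node types that each add one decision point in this simplified count.
_BRANCH_NODES = (
    ast.If, ast.For, ast.AsyncFor, ast.While,
    ast.ExceptHandler, ast.IfExp, ast.comprehension,
)


def approximate_complexity(source: str) -> dict[str, int]:
    """Return {function name: 1 + number of decision points found in it}."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        score = 1  # a function with no branches has a single path
        for child in ast.walk(node):
            if isinstance(child, _BRANCH_NODES):
                score += 1
            elif isinstance(child, ast.BoolOp):
                # "a and b and c" adds two extra paths, one per operator.
                score += len(child.values) - 1
        scores[node.name] = score
    return scores


def at_risk(scores: dict[str, int], limit: int = 10) -> list[str]:
    """Names whose score exceeds the limit (the README flags CC > 10)."""
    return sorted(name for name, score in scores.items() if score > limit)
```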

Best Practices

1. Regular Analysis

from code_cartographer import ProjectAnalyzer
import datetime

# Add to your analysis pipeline
analyzer = ProjectAnalyzer(".")
results = analyzer.analyze()
date_str = datetime.datetime.now().strftime("%Y%m%d")
analyzer.generate_markdown(f"analysis-{date_str}.md")

2. Large Projects

from code_cartographer import ProjectAnalyzer

# Analyze specific directories with exclusions
analyzer = ProjectAnalyzer(
    "src",
    exclude_patterns=[
        "tests/.*",
        "docs/.*",
        r".*\.pyc",  # regex form, assuming patterns are regular expressions like the others
        "__pycache__/.*"
    ]
)
results = analyzer.analyze()

Troubleshooting

Memory Issues

  • Reduce analysis scope using exclude patterns
  • Process directories sequentially for large projects
  • Use the similarity threshold in VariantAnalyzer to limit comparisons

Performance Tips

  • Focus analysis on specific directories
  • Use appropriate similarity thresholds
  • Leverage code normalization options

Common Issues

  • Ensure Python 3.10+ is being used
  • Check file permissions for output directories
  • Verify Graphviz installation for dependency graphs
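
One way to check the Graphviz point before blaming the tool is to look for the `dot` executable yourself. The helper below is a hypothetical sketch, not part of code-cartographer. It renders an existing .dot file to SVG when `dot` is on the PATH and reports `False` otherwise.

```python
import shutil
import subprocess


def render_dot(dot_path: str, output_path: str) -> bool:
    """Render a .dot file to SVG; return False if Graphviz is not installed."""
    dot = shutil.which("dot")  # None when the Graphviz CLI is not on PATH
    if dot is None:
        return False
    # -Tsvg selects the SVG output format; -o names the output file.
    subprocess.run([dot, "-Tsvg", dot_path, "-o", output_path], check=True)
    return True
```

If this returns `False`, install Graphviz through your system package manager and try again. The .dot file itself is plain text and can still be inspected without it.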

Author Notes

This tool exists to reconcile broken, duplicated, or ghost-forked Python projects. It helps you detect what's salvageable, refactor what's duplicated, and visualize the mess you made.

Whether you're dealing with:

  • Fragmented directories
  • Local edits lost to time
  • Abandoned branches and reanimated scripts

This is for you. Or at least, for the version of you that still wants to fix it.

"Structured remorse for unstructured code."


License

MIT License. See LICENSE file for details.
