Static Python codebase walkthrough generator for onboarding developers.

analyze_project

analyze-project helps developers understand unfamiliar Python codebases by generating a step-by-step walkthrough, execution flow, decision tree, data flow summary, and dependency graphs.

It is designed for onboarding into a project without running the target codebase.

What It Generates

A default run produces human-readable onboarding reports alongside the technical artifacts:

  • walkthrough.md: narrative walkthrough for a new developer
  • execution_flow.json: structured entry point and execution-flow data
  • decision_tree.md: important if / else / try branches
  • data_flow.md: simple assignment-to-call data movement
  • project_analysis.json: machine-readable structural analysis
  • project_analysis.png: module dependency graph
  • project_call_graph.png: cross-module call graph when available
  • project_hotspots.png: most-called functions and methods when available
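
As a quick check after a run, the file names listed above can be compared against what is actually on disk. This is a minimal sketch, not part of analyze-project: the helper name `check_artifacts` is hypothetical, and it assumes all artifacts sit directly in the output directory.

```python
from pathlib import Path

# File names copied from the list above. The call graph and hotspot
# images are documented as "when available", so they may be absent.
EXPECTED_ARTIFACTS = [
    "walkthrough.md",
    "execution_flow.json",
    "decision_tree.md",
    "data_flow.md",
    "project_analysis.json",
    "project_analysis.png",
    "project_call_graph.png",
    "project_hotspots.png",
]


def check_artifacts(output_dir):
    """Return a mapping of artifact file name -> whether it exists on disk.

    Hypothetical helper; assumes a flat output directory layout.
    """
    root = Path(output_dir)
    return {name: (root / name).is_file() for name in EXPECTED_ARTIFACTS}
```

Calling `check_artifacts("analysis_output/my_project")` after a default run gives a simple present/missing overview without opening any of the files.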

Requirements

  • Python 3.9+
  • Runtime dependencies are installed automatically from package metadata

Installation

pip install analyze-project

For isolated CLI installs, pipx is a good fit:

pipx install analyze-project

For a local checkout:

python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Usage

Analyze the current working directory:

analyze-project .

Analyze a specific project:

analyze-project /path/to/python-project

Only generate onboarding reports and skip graph/chart generation:

analyze-project /path/to/python-project --walkthrough-only

Focus on one entry point:

analyze-project /path/to/python-project --entry app.main::main

Only generate the backward-compatible JSON report:

analyze-project /path/to/python-project --json-only

Module execution is also supported:

python -m analyze_project /path/to/python-project

At the end of a normal run, the tool prints the output directory and the direct path to walkthrough.md, for example:

Output dir  : /repo/analysis_output/my_project
Walkthrough : /repo/analysis_output/my_project/walkthrough.md
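
If a script needs those two paths, the closing lines can be picked out of captured terminal output. The sketch below assumes the `Label : value` layout shown in the example above stays stable, which this document does not guarantee; `parse_summary` is a hypothetical helper name.

```python
from pathlib import Path


def parse_summary(text):
    """Extract the 'Output dir' and 'Walkthrough' paths from run output.

    Assumes each path is printed on its own line as 'Label : value',
    as in the example above. Other lines are ignored.
    """
    found = {}
    for line in text.splitlines():
        # Split on the first colon only, so colons inside the path survive.
        label, sep, value = line.partition(":")
        label = label.strip()
        if sep and label in ("Output dir", "Walkthrough"):
            found[label] = Path(value.strip())
    return found
```

Lines that do not start with one of the two labels are skipped, so the rest of the terminal summary can be passed in unchanged.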

CLI

analyze-project [target] [--output-dir PATH] [--depth N | --max-depth N]
                [--max-branches N] [--max-entry-trees N] [--hotspots N]
                [--non-recursive] [--include-glob PATTERN] [--exclude-dir NAME]
                [--entry ENTRYPOINT] [--show-unresolved]
                [--summary-only | --json-only | --walkthrough-only]

  • target: Directory to analyze. Defaults to the current working directory.
  • --output-dir PATH: Output directory for generated artifacts. Defaults to ./analysis_output/<project_name>/.
  • --depth N, --max-depth N: Maximum call tree and walkthrough depth.
  • --max-branches N: Maximum call-tree branches per node.
  • --max-entry-trees N: Maximum number of technical entry-point call trees printed.
  • --hotspots N: Number of most-called callables to include.
  • --non-recursive: Only analyze Python files in the top-level target directory.
  • --include-glob PATTERN: Repeatable relative-path glob filter for .py files.
  • --exclude-dir NAME: Repeatable directory name to exclude in addition to the built-in skip list.
  • --entry ENTRYPOINT: Generate walkthrough details only for a matching entry point.
  • --show-unresolved: Print unresolved call examples in the terminal summary.
  • --summary-only: Print the technical terminal summary without writing artifacts.
  • --json-only: Only write project_analysis.json; no walkthrough or Markdown files are written.
  • --walkthrough-only: Only write walkthrough.md, execution_flow.json, decision_tree.md, and data_flow.md.
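
When the tool is driven from another script, the options above can be assembled into an argument list. This is a sketch under assumptions: `build_command` and its keyword names are hypothetical, only flags documented above are used, and the three output modes are treated as mutually exclusive because the synopsis groups them with `|`.

```python
import sys

# The synopsis lists these three flags as alternatives.
MODE_FLAGS = {
    "summary": "--summary-only",
    "json": "--json-only",
    "walkthrough": "--walkthrough-only",
}


def build_command(target=".", output_dir=None, depth=None, entry=None,
                  exclude_dirs=(), mode=None):
    """Build an argument list for `python -m analyze_project`.

    Hypothetical helper: it only arranges documented flags and does not
    validate their values.
    """
    cmd = [sys.executable, "-m", "analyze_project", str(target)]
    if output_dir is not None:
        cmd += ["--output-dir", str(output_dir)]
    if depth is not None:
        cmd += ["--depth", str(depth)]
    if entry is not None:
        cmd += ["--entry", entry]
    for name in exclude_dirs:
        # --exclude-dir is documented as repeatable.
        cmd += ["--exclude-dir", name]
    if mode is not None:
        if mode not in MODE_FLAGS:
            raise ValueError("mode must be one of: " + ", ".join(sorted(MODE_FLAGS)))
        cmd.append(MODE_FLAGS[mode])
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the standard library. Building a list rather than a shell string avoids quoting problems with paths that contain spaces.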

Walkthrough Content

walkthrough.md includes:

  • Project Purpose Guess
  • How to Run
  • Entry Points ranked by confidence and source
  • Execution Walkthrough
  • Decision Points
  • Data Flow
  • Side Effects
  • Final Outcome
  • Unknowns / Static Analysis Limits

The analyzer ranks entry points in this order:

  1. console_scripts from package metadata
  2. if __name__ == "__main__"
  3. FastAPI or Flask app objects and route handlers
  4. main(), run(), start()
  5. handler() and lambda_handler()
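
To illustrate how such a ranking might be applied to a single file, the sketch below detects the `__main__` guard and the conventional function names with the standard `ast` module. It is not the analyzer's actual implementation: it covers only items 2, 4, and 5 of the list, looks at top-level statements only, and all names in it are hypothetical.

```python
import ast

# Rank numbers mirror the ordered list above (lower = higher priority).
FUNCTION_RANKS = {"main": 4, "run": 4, "start": 4,
                  "handler": 5, "lambda_handler": 5}


def is_main_guard(node):
    """True for a statement of the form `if __name__ == "__main__":`."""
    if not isinstance(node, ast.If):
        return False
    test = node.test
    return (
        isinstance(test, ast.Compare)
        and isinstance(test.left, ast.Name)
        and test.left.id == "__name__"
        and len(test.ops) == 1
        and isinstance(test.ops[0], ast.Eq)
        and len(test.comparators) == 1
        and isinstance(test.comparators[0], ast.Constant)
        and test.comparators[0].value == "__main__"
    )


def find_entry_candidates(source):
    """Return sorted (rank, description) tuples for top-level candidates."""
    candidates = []
    for node in ast.parse(source).body:
        if is_main_guard(node):
            candidates.append((2, "__main__ guard"))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            rank = FUNCTION_RANKS.get(node.name)
            if rank is not None:
                candidates.append((rank, node.name + "()"))
    return sorted(candidates)
```

Sorting the tuples puts the strongest candidate first, so a file with both a `__main__` guard and a `main()` function reports the guard ahead of the function.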

Static Analysis Limits

The target project is never executed. The analyzer uses Python AST parsing and metadata inspection, so some behavior cannot be known with certainty.
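
As a rough picture of what parsing without execution looks like, the following sketch lists `if` and `try` statements with their line numbers using the standard `ast` module. It only illustrates the general technique; `find_branch_points` is a hypothetical name, and the analyzer's own decision-tree logic is not shown in this document.

```python
import ast


def find_branch_points(source):
    """Return sorted (line, kind) tuples for if and try statements.

    The source is parsed, never run. An `elif` is represented in the AST
    as a nested `if`, so it is reported as "if" as well.
    """
    points = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.If):
            points.append((node.lineno, "if"))
        elif isinstance(node, ast.Try):
            points.append((node.lineno, "try"))
    return sorted(points)
```

Because nothing is imported or executed, this works on code whose dependencies are not installed, which is the property the analyzer relies on.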

Known limitations:

  • Dynamic imports, monkey patching, runtime dependency injection, and reflection may be missed.
  • Deep data-flow and alias tracking are intentionally shallow.
  • External library behavior is summarized as side effects instead of expanded into the local execution tree.
  • Unknown object calls may appear as unresolved even when they are valid at runtime.
  • Secret values are redacted from output, but variable names are not; avoid analyzing real .env files if you do not want their variable names listed.
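
The first limitation can be seen with a small experiment. The sketch below collects module names from import statements in the AST; a module loaded through `importlib.import_module` with a computed name never appears in the result. The sample source and the `plugins.csv_export` module name are made up for illustration.

```python
import ast


def static_imports(source):
    """Collect module names from import statements visible in the AST."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module)
    return names


# Hypothetical target code: the last line imports a module at runtime
# using a name that is built from two strings.
SAMPLE = '''
import json
from pathlib import Path
import importlib

plugin = importlib.import_module("plugins." + "csv_export")
'''

print(sorted(static_imports(SAMPLE)))
```

The printed list contains `importlib`, `json`, and `pathlib`, but not `plugins.csv_export`, because that name only exists once the string concatenation is evaluated at runtime.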

Security And Secrets

Values that look like tokens, passwords, API keys, credentials, or secrets are masked in generated reports.

Example:

TELEGRAM_BOT_TOKEN=***
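
One plausible way to produce output like the example above is to mask values whose variable name contains a sensitive keyword. This is an assumption for illustration only: the tool's actual detection rules are not documented here, and `mask_line` and the keyword pattern are hypothetical.

```python
import re

# Keywords taken from the sentence above; matching on the variable name
# is an assumption, not the tool's documented behavior.
SENSITIVE = re.compile(r"token|password|api_?key|credential|secret",
                       re.IGNORECASE)


def mask_line(line):
    """Replace the value of a KEY=value line with *** if KEY looks sensitive."""
    key, sep, value = line.partition("=")
    if sep and value.strip() and SENSITIVE.search(key):
        return key + "=***"
    return line
```

Note that the variable name itself stays visible, which matches the limitation described under Static Analysis Limits.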

Development

Install development dependencies and run checks:

pip install -e ".[dev]"
python -m py_compile analyze_project.py tests/test_cli.py
ruff check .
pytest
rm -rf dist build *.egg-info
python -m build
python -m twine check dist/*

Publishing

Before publishing a new version to PyPI, bump version in pyproject.toml, clean old build artifacts, build, and upload:

rm -rf dist build *.egg-info
python -m build
python -m twine check dist/*
python -m twine upload dist/*

PyPI does not allow a published version to be uploaded again, so 0.1.0 cannot be reused; use a new version such as 0.1.1.

License

MIT. See LICENSE.
