Skip to main content

Static import graph + PageRank for Python projects

Project description

Python Page Rank

Static dependency analysis and PageRank-style importance scoring for Python modules.

This tool scans a Python codebase, builds a directed import graph between modules, and computes a PageRank-like score indicating which modules are most structurally important based on how much other code depends on them.

No code execution and no imports.

Quickstart

Install from PyPI:

pip install python-page-rank
  • Run on a project directory: python-page-rank .
  • Show the top 5 ranked modules: python-page-rank . --n 5
  • Emit JSON instead of a table: python-page-rank . --json

Run from src:

python cli.py /path/to/project

Example

python-page-rank /path/to/project

output:

Module                                     LOC  Importers   PageRank
----------------------------------------------------------------------
MyApp\authentication\models.py              29         12   0.114471
MyApp\common\utils.py                       66          6   0.094118
MyApp\payments\models.py                   168         15   0.053272
...

CLI usage

The tool is intended to be run from the command line.

Syntax

python-page-rank [path] [--n N] [--alpha A] [--iters I] [--json] [--include-init] 

Arguments

path
Project root directory to scan.
Optional. Defaults to the current directory (.).

--n N
Number of top-ranked files to display.
Optional. Default is 10.

--alpha A
PageRank damping factor.
Optional. Default is 0.85.

--iters ITERS
Number of PageRank iterations to run.
Optional. Default is 50.

--json
Prints results as JSON instead of a text table.

--include-init
Include __init__.py files in the output.
By default, package initializer files are hidden to reduce noise.

Examples

Analyze the current directory with default settings: python-page-rank
Analyze a specific project directory: python-page-rank MyApp
Show only the top 10 ranked files: python-page-rank --n 10
Use a custom damping factor and iteration count: python-page-rank --alpha 0.9 --iters 100
Analyze a directory and output results as JSON: python-page-rank MyApp --json
Combine options: python-page-rank MyApp --n 20 --alpha 0.8 --iters 75 --json

What it does

  • Recursively scans a project directory for .py files
  • Parses files using AST (no code execution)
  • Builds a module-level dependency graph from import / from X import Y
  • Computes PageRank over that graph
  • Outputs per-file:
    • lines of code (LOC)
    • number of importers
    • PageRank score

Typical uses:

  • identify central / risky modules
  • guide refactors
  • estimate blast radius of changes
  • understand large or unfamiliar codebases

What it does not do (by design)

  • Execute or import any project code
  • Modify files
  • Resolve runtime-dependent imports
  • Guess ambiguous dependencies

This is a conservative static analyzer.

How imports are handled

Supported

  • import package.module
  • import package.module as alias
  • from package import module
  • from package.subpackage import thing

All resolved to absolute module paths and mapped to files when possible.

Intentionally ignored

  • from . import models
  • from .. import utils

These produce no edge in the graph.
Relative imports with an explicit module (e.g. from .models import X) are handled.

Why:
The AST provides no absolute module path (node.module is None). Resolving these requires guessing project roots and can be wrong in multi-root or namespace-package layouts. This behavior is deliberate and covered by tests.

Directory handling

The scanner:

  • automatically detects multiple import roots
  • prefers the shortest valid module path when duplicates exist
  • skips common junk directories:
    • .git, .venv, venv, node_modules, __pycache__, etc.

This allows it to work on:

  • monorepos
  • Django / FastAPI projects
  • repos with multiple top-level packages

PageRank details

  • Standard iterative PageRank
  • Damping factor: 0.85
  • Teleportation distributes rank uniformly
  • Ranks are normalized to sum to 1.0

Interpretation:
A module is "important" if many important modules depend on it.

This measures structural importance, not runtime usage frequency.

Notes

  • The analysis is fully static
  • No code is executed
  • Imports are resolved via AST parsing only
  • PYTHONPATH and runtime environment are ignored

Tests

Tests cover:

  • absolute import resolution
  • multi-root handling
  • preference of shortest module paths
  • ignored relative dot-imports
  • stability of PageRank output

Run tests with:

python -m pytest

Limitations (read before filing issues)

  • Relative imports without explicit module paths are ignored
  • Namespace packages may require manual root layout
  • Large repos may take time due to full AST parsing
  • No attempt is made to infer dynamic imports

These are tradeoffs, not bugs.

License

MIT License

Disclaimer / Caveat

If you need aggressive inference, this tool is intentionally not that.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

python_page_rank-0.1.3.tar.gz (13.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

python_page_rank-0.1.3-py3-none-any.whl (8.3 kB view details)

Uploaded Python 3

File details

Details for the file python_page_rank-0.1.3.tar.gz.

File metadata

  • Download URL: python_page_rank-0.1.3.tar.gz
  • Upload date:
  • Size: 13.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for python_page_rank-0.1.3.tar.gz
Algorithm Hash digest
SHA256 4a4e8164f2515d89d37df601ac06446ee9e5e9241bb03661ddf90b803e3e800a
MD5 cdd72878371f3d2df0ff9a03a7ad2ade
BLAKE2b-256 95095fe7ca3d14f4bdd3887264bd1faf378957838e7dc16a73d78b0c4223374a

See more details on using hashes here.

File details

Details for the file python_page_rank-0.1.3-py3-none-any.whl.

File metadata

File hashes

Hashes for python_page_rank-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 862dfbe28a4007c7670f69bd28c91a54d1fdc56c4cde3be2589711b818f7df4c
MD5 31b2415e975daed935f7c7a6d34733d1
BLAKE2b-256 5b7bf27e89e561c1bc6b7a7e3294e9240233996513260dcf4544ff034e315eea

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page