Project description

py-dependency-mapper

High-performance static analyzer to map Python dependencies — written in Rust and powered by the Ruff parser.


Overview

py-dependency-mapper is a high-performance tool for analyzing static dependencies in Python projects.
It is implemented in Rust and uses the Ruff parser to provide extremely fast and accurate parsing of import graphs.

This makes it ideal for packaging (e.g., serverless deployments), dependency audits, or simply understanding the dependency graph of large applications.


Features

⚡ High performance thanks to the Ruff parser.

🧩 Two-phase architecture: indexing and subgraph extraction per entry point.

🎯 Prefix filtering (e.g., ["my_app"]) to reduce noise.

📦 Complete pip package analysis with automatic dependency resolution.

🛃 Customizable manual mappings via TOML files.

🔍 Impact Analysis (Reverse Lookups): Instantly find which files depend on a specific module (ideal for Smart Testing).

🐍 Python API and CLI utilities.

🚀 CI/CD friendly — designed for large projects with hundreds or thousands of files.


Installation

From PyPI

pip install py-dependency-mapper

Basic Usage

The workflow is designed to be efficient:

1. Indexing Phase — build a map of your entire project (or only the parts you're interested in).
2. Querying Phase — use that map to instantly resolve the dependencies of specific entry points.
3. Package Analysis — analyze and resolve the dependencies of installed pip packages.


Example Project

/path/to/project/
└── my_app/
    ├── __init__.py
    ├── main.py       # imports utils
    └── utils.py      # has no other local imports

Usage

import py_dependency_mapper
from pprint import pprint

# --- PHASE 1: Indexing (Done once at the start) ---
# This builds a complete map of all files, their hashes, and their imports.
# This is the heavy operation, but it's only done once.
print("Building the project's dependency map...")

dependency_map = py_dependency_mapper.build_dependency_map(
    source_root="/path/to/project",
    project_module_prefixes=["my_app"],
    include_paths=["my_app/"],
    stdlib_list_path="/path/to/stdlib.txt"  # Optional
)
print(f"Map built with {len(dependency_map)} files.")
# Expected output: Map built with 3 files.

# --- PHASE 2: Querying (Done as many times as you need) ---
# Now, for any Lambda or application entry point, you can get
# its specific dependency graph almost instantly.
entry_point = "/path/to/project/my_app/main.py"

print(f"\nGetting dependency graph for: {entry_point}")
# This call is extremely fast because it only queries the in-memory map.
dependency_graph = py_dependency_mapper.get_dependency_graph(
    dependency_map=dependency_map,
    entry_point=entry_point
)

print(f"The entry point requires {len(dependency_graph)} total files.")

# Examine detailed information for each file
for path, file_info in dependency_graph.items():
    print(f"\nFile: {path}")
    print(f"  Hash: {file_info.hash}")
    print(f"  Stdlib imports: {file_info.stdlib_imports}")
    print(f"  Third party imports: {file_info.third_party_imports}")

# --- PHASE 3: Impact Analysis (Reverse Lookup) ---
# Ideal for CI/CD: Determine which tests to run based on changed files.
# If you modify 'utils.py', this tells you exactly which files import it (recursively).
# Accepts a LIST of changed files for batch processing.

changed_files = [
    "/path/to/project/my_app/utils.py",
    "/path/to/project/my_app/main.py"
]
print(f"\nCalculating impact for changes in: {changed_files}")

# Returns a Set of file paths that depend on ANY of the changed files
impacted_files = py_dependency_mapper.find_dependents(
    dependency_map=dependency_map,
    changed_file_paths=changed_files
)

print(f"This change affects {len(impacted_files)} files:")
for path in impacted_files:
    print(f"  -> {path}")

# Example CI Logic:
# tests_to_run = [f for f in impacted_files if f.startswith("tests/") or f.endswith("_test.py")]
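In CI, the impacted set can feed straight into a test selector. A minimal pure-Python sketch of that last step (select_tests is a hypothetical helper and the path conventions are assumptions; in practice impacted_files would come from find_dependents):

```python
def select_tests(impacted_files):
    """Keep only the paths that look like test files."""
    return sorted(
        f for f in impacted_files
        if f.startswith("tests/") or "/tests/" in f or f.endswith("_test.py")
    )

impacted_files = {
    "my_app/main.py",
    "tests/test_main.py",
    "my_app/utils_test.py",
}
print(select_tests(impacted_files))
# -> ['my_app/utils_test.py', 'tests/test_main.py']
```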

Pip Package Dependency Analysis

The library includes advanced capabilities for analyzing installed pip packages:

Dependency Tree Generation

# Generate a dependency tree using uv (shell)
uv pip tree > dependencies.txt

# Then convert the text output to a nested dict with the helper below (Python)
with open("dependencies.txt") as f:
    tree = parse_pip_tree_to_dict(f.read())

Custom conversion script

import re

def parse_pip_tree_to_dict(tree_output: str) -> dict:
    """
    TODO: https://github.com/astral-sh/uv/issues/4711
    Revisit when this issue is resolved. `uv` will then be able to
    generate JSON output directly, making this parser obsolete.
    """
    dependency_tree = {}
    stack = [(-1, dependency_tree)]

    for line in tree_output.strip().splitlines():
        stripped = line.lstrip(" │├─└")
        indentation = len(line) - len(stripped)
        clean_line = stripped.strip()

        is_duplicate = "(*)" in clean_line
        clean_line = clean_line.replace(" (*)", "").strip()

        match = re.match(r"([\w.-]+)(?:[=v\s]+)([\d.\w+-]+)", clean_line)
        if not match:
            continue

        package_name, version = match.groups()

        while stack[-1][0] >= indentation:
            stack.pop()

        parent_dependencies = stack[-1][1]

        current_package_node = {"version": version, "dependencies": {}}

        parent_dependencies[package_name] = current_package_node

        if not is_duplicate:
            stack.append((indentation, current_package_node["dependencies"]))

    return dependency_tree
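The regex in the parser expects lines shaped like `requests v2.31.0` once the tree glyphs are stripped. A quick sanity check of what it captures (note that uv's exact output format may change between versions):

```python
import re

# Same pattern as in parse_pip_tree_to_dict: name, then "v"/"="/space
# separators, then version.
LINE_RE = re.compile(r"([\w.-]+)(?:[=v\s]+)([\d.\w+-]+)")

for raw in ["requests v2.31.0", "charset-normalizer v3.3.2", "typing_extensions v4.9.0"]:
    name, version = LINE_RE.match(raw).groups()
    print(f"{name} -> {version}")
```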

Pip Metadata Analysis

# Build pip package metadata
pip_metadata = py_dependency_mapper.build_pip_metadata(
    dependency_tree_json_path="dependencies.json",
    site_packages_path="/path/to/site-packages",
    manual_mapping_path="mappings.toml"  # Optional: custom mappings
)

# Explore available information
print("Import to pip package mapping:")
pprint(pip_metadata.import_to_pip_map)

print("\nPackage information:")
for pkg_name, pkg_info in pip_metadata.pip_package_info_map.items():
    print(f"{pkg_name}: v{pkg_info.version}")
    print(f"  Dependencies: {pkg_info.dependencies}")
    print(f"  Installed paths: {pkg_info.installed_paths}")

Package Set Resolution

# Automatically resolve all dependencies for specific packages
resolved_packages = py_dependency_mapper.resolve_package_set(
    direct_packages=["requests", "numpy", "pandas"],
    pip_metadata=pip_metadata
)

print("Resolved packages with all their dependencies:")
for pkg_name, pkg_info in resolved_packages.items():
    print(f"  {pkg_name} v{pkg_info.version}")

🔧 Manual Mappings with TOML

For cases where automatic detection is not sufficient, you can use a TOML file for custom mappings:

# mappings.toml

# Map import names to pip package names
[import_mappings]
"cv2" = "opencv-python"
"sklearn" = "scikit-learn" 
"PIL" = "Pillow"
"yaml" = "PyYAML"

# Additional dependencies that should be included
[extra_dependencies]
"fastapi" = ["uvicorn", "python-multipart"]
"pydantic" = ["email-validator"]

# Additional package paths
[extra_package_paths]
"tensorflow" = ["bin", "include", "lib"]
"gremlinpython" = ["bin", "lib"]

📖 API Reference

build_dependency_map(
    source_root: str,
    project_module_prefixes: List[str],
    include_paths: List[str],
    stdlib_list_path: Optional[str] = None
) -> Dict[str, ProjectFile]

Scans the project and builds the dependency map.

  • source_root: Absolute path to the root of your source code.

  • project_module_prefixes: A list of module prefixes to include in the analysis (e.g., ["my_app"]).

  • include_paths: A list of directories or files (relative to source_root) to begin the scan from.

  • stdlib_list_path: Optional path to a file containing standard library module names.

  • returns: A dictionary mapping file paths to ProjectFile objects.


get_dependency_graph(
    dependency_map: Dict,
    entry_point: str
) -> Dict[str, GraphFileResult]

From the pre-built map, gets the dependency subgraph for a specific entry point.

  • dependency_map: The dictionary returned by build_dependency_map.

  • entry_point: The absolute path to the initial .py file.

  • returns: A dictionary mapping file paths to GraphFileResult objects.


find_dependents(
    dependency_map: Dict,
    changed_file_paths: List[str]
) -> Set[str]

Performs a reverse dependency lookup: identifies all files in the project that depend on (import) any of the specified files, either directly or transitively.

  • dependency_map: The dictionary returned by build_dependency_map.
  • changed_file_paths: A list of absolute paths to the files that were modified.
  • returns: A Set of strings containing the absolute paths of all files that are impacted by the changes.
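Conceptually, find_dependents is a breadth-first walk over the inverted import graph. A toy pure-Python equivalent (not the library's Rust implementation; find_dependents_toy and its inputs are illustrative only):

```python
from collections import deque

def find_dependents_toy(imports: dict[str, set[str]], changed: list[str]) -> set[str]:
    """Return every file that transitively imports any changed file.

    `imports` maps each file to the set of files it imports directly.
    """
    # Invert the edges: for each file, record who imports it.
    importers: dict[str, set[str]] = {}
    for src, targets in imports.items():
        for tgt in targets:
            importers.setdefault(tgt, set()).add(src)

    # Walk outward from the changed files along the inverted edges.
    impacted: set[str] = set()
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dep in importers.get(current, ()):
            if dep not in impacted:
                impacted.add(dep)
                queue.append(dep)
    return impacted

imports = {
    "my_app/main.py": {"my_app/utils.py"},
    "my_app/utils.py": set(),
    "tests/test_main.py": {"my_app/main.py"},
}
print(sorted(find_dependents_toy(imports, ["my_app/utils.py"])))
# -> ['my_app/main.py', 'tests/test_main.py']
```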

Pip Package Analysis Functions

build_pip_metadata(
    dependency_tree_json_path: str,
    site_packages_path: str,
    manual_mapping_path: Optional[str] = None
) -> PipMetadata

Builds metadata for installed pip packages from a dependency tree JSON file.

  • dependency_tree_json_path: Path to JSON file containing the dependency tree.

  • site_packages_path: Path to the site-packages directory.

  • manual_mapping_path: Optional path to TOML file with manual mappings.

  • returns: A PipMetadata object containing package information and mappings.

resolve_package_set(
    direct_packages: List[str],
    pip_metadata: PipMetadata
) -> Dict[str, PipPackageInfo]

Resolves all dependencies for a set of direct packages.

  • direct_packages: List of package names to resolve dependencies for.

  • pip_metadata: The PipMetadata object from build_pip_metadata.

  • returns: A dictionary mapping package names to PipPackageInfo objects.
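Conceptually, resolve_package_set computes the transitive closure of the dependency lists. A toy pure-Python equivalent (resolve_package_set_toy and the sample dependency data are illustrative, not the library's implementation):

```python
def resolve_package_set_toy(direct: list[str], dep_map: dict[str, list[str]]) -> set[str]:
    """`dep_map` maps a package name to its direct dependency names."""
    resolved: set[str] = set()
    stack = list(direct)
    while stack:
        pkg = stack.pop()
        if pkg in resolved:
            continue  # already visited; avoids cycles and rework
        resolved.add(pkg)
        stack.extend(dep_map.get(pkg, []))
    return resolved

dep_map = {
    "requests": ["urllib3", "idna", "certifi", "charset-normalizer"],
    "urllib3": [],
    "idna": [],
    "certifi": [],
    "charset-normalizer": [],
}
print(sorted(resolve_package_set_toy(["requests"], dep_map)))
# -> ['certifi', 'charset-normalizer', 'idna', 'requests', 'urllib3']
```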


Data Structures

GraphFileResult

Contains information about a Python source file:

  • hash: SHA256 hash of the file content.

  • project_imports: List of imported project modules (file paths).

  • stdlib_imports: List of imported standard library modules.

  • third_party_imports: List of imported third-party packages.
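GraphFileResult itself is implemented in Rust; for orientation, a Python dataclass mirroring the documented fields might look like this (purely illustrative, not the real class):

```python
from dataclasses import dataclass, field

@dataclass
class GraphFileResultSketch:
    """Illustrative mirror of GraphFileResult's documented fields."""
    hash: str                                              # SHA256 of the file content
    project_imports: list[str] = field(default_factory=list)
    stdlib_imports: list[str] = field(default_factory=list)
    third_party_imports: list[str] = field(default_factory=list)

info = GraphFileResultSketch(hash="ab12cd34", stdlib_imports=["json", "os"])
print(info.stdlib_imports)  # -> ['json', 'os']
```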

PipMetadata

Contains pip package analysis results:

  • import_to_pip_map: Mapping from import names to pip package names.

  • pip_package_info_map: Mapping from pip package names to package information.

  • extra_dependencies_map: Manual additional dependencies from TOML.

  • extra_paths_map: Manual additional paths from TOML.

PipPackageInfo

Information about a specific pip package:

  • version: Package version string.

  • installed_paths: List of installed file/directory paths.

  • dependencies: List of direct dependency package names.


📜 License

This project is licensed under the MIT License.
See the LICENSE file for more details.


🙌 Acknowledgements

This tool would not be possible without the incredible work of the team behind the Ruff project,
whose high-performance parser is the heart of this analyzer.

Ruff's license can be found in licenses/LICENSE-RUFF.md.



Download files

Download the file for your platform.

Source Distribution

py_dependency_mapper-0.1.5.tar.gz (26.4 kB)

Uploaded: Source

Built Distributions


py_dependency_mapper-0.1.5-cp313-cp313-win_amd64.whl (1.2 MB)

Uploaded: CPython 3.13, Windows x86-64

py_dependency_mapper-0.1.5-cp313-cp313-manylinux_2_34_x86_64.whl (1.3 MB)

Uploaded: CPython 3.13, manylinux (glibc 2.34+), x86-64

py_dependency_mapper-0.1.5-cp313-cp313-macosx_11_0_arm64.whl (1.2 MB)

Uploaded: CPython 3.13, macOS 11.0+ ARM64

py_dependency_mapper-0.1.5-cp313-cp313-macosx_10_12_x86_64.whl (1.3 MB)

Uploaded: CPython 3.13, macOS 10.12+ x86-64

File details

Details for the file py_dependency_mapper-0.1.5.tar.gz.

File metadata

  • Download URL: py_dependency_mapper-0.1.5.tar.gz
  • Upload date:
  • Size: 26.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.10.2

File hashes

Hashes for py_dependency_mapper-0.1.5.tar.gz
Algorithm Hash digest
SHA256 0fd69c1348f2d389aea6e7c2755f1c121b806ccbd9f7bfdf665c23e3f2b73428
MD5 e11fdf875d1d3e36d53bbf979dfaf316
BLAKE2b-256 fc4343e1de5366102da9352cb6f843763d1ddc4d2628a123756fa90502511424

See more details on using hashes here.

File details

Details for the file py_dependency_mapper-0.1.5-cp313-cp313-win_amd64.whl.

File hashes

Hashes for py_dependency_mapper-0.1.5-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 efdebf0fc64b1a3f5c68a79a51449b87f82c1b2b38bd5558afdc17744b59c9ee
MD5 3e2e3b95551ffae04eb7de74b25a5d23
BLAKE2b-256 274b675e5a15708901937fe33746c89b3fee97fada3ffde6c422659d243f98a3


File details

Details for the file py_dependency_mapper-0.1.5-cp313-cp313-manylinux_2_34_x86_64.whl.

File hashes

Hashes for py_dependency_mapper-0.1.5-cp313-cp313-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 6951088cbc1dd8c6e762d25d3493bf8cbcf2c7352afb4f39e1a5d7a2daa972c3
MD5 e743d48a6f18485087b576c3055341c0
BLAKE2b-256 51164ae326e469a4727ec49b1e65d8e6f20e5462a86e94416ae907f9c86051c1


File details

Details for the file py_dependency_mapper-0.1.5-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for py_dependency_mapper-0.1.5-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 77212f5a727cb69f6f1dd2d43ebb7e1d4c78d33971c2d743bdc7a984edf7c397
MD5 fef0b4a02231b3fd1d1d215215750362
BLAKE2b-256 fd5095c70b29fd8b6d8cd7159c3a8c1638fb4bbc8baa4977922e3b6098b8c4a2


File details

Details for the file py_dependency_mapper-0.1.5-cp313-cp313-macosx_10_12_x86_64.whl.

File hashes

Hashes for py_dependency_mapper-0.1.5-cp313-cp313-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 9c45e580ec0f7107aa90ea5b1f5fc79148c80dedaebe7b75bb790a8f7af872f0
MD5 b91099f7ba7d8cdf413db7678023f7fb
BLAKE2b-256 8e161328361ecc73b9169d538f8ad52c9d4e1939efd03d668698e3f80e571350

