A high-performance Python dependency mapping tool written in Rust, powered by the Ruff parser.

py-dependency-mapper

High-performance static analyzer to map Python dependencies — written in Rust and powered by the Ruff parser.


Overview

py-dependency-mapper is a high-performance tool for statically analyzing dependencies in Python projects.
It is implemented in Rust and uses the Ruff parser to parse source files quickly and extract their import graph.

This makes it ideal for packaging (e.g., serverless deployments), dependency audits, or simply understanding the dependency graph of large applications.


Features

:zap: High performance thanks to the Ruff parser.

:jigsaw: Two-phase architecture: indexing and subgraph extraction per entry point.

:dart: Prefix filtering (e.g., ["my_app"]) to reduce noise.

:package: Complete pip package analysis with automatic dependency resolution.

:customs: Customizable manual mappings via TOML files.

:snake: Python API and CLI utilities.

:rocket: CI/CD friendly — designed for large projects with hundreds or thousands of files.


Installation

From PyPI

pip install py-dependency-mapper

Basic Usage

The workflow has three steps:

Indexing Phase: build a map of your entire project (or only the parts you're interested in). This is the expensive step, and it runs once.
Querying Phase: use that in-memory map to resolve the dependencies of specific entry points, as many times as you need.
Package Analysis: analyze and resolve dependencies of installed pip packages.
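
The split between indexing and querying can be illustrated with a toy, pure-Python sketch. This is not the library's implementation (which is written in Rust); the `index` dictionary and the `subgraph` function are hypothetical stand-ins that only show why a query is cheap once the map exists: it is a graph walk over data already in memory.

```python
from collections import deque

# Hypothetical pre-built index: file path -> project files it imports.
# In the real tool, a map like this is the product of the indexing phase.
index = {
    "my_app/main.py": ["my_app/utils.py", "my_app/__init__.py"],
    "my_app/utils.py": ["my_app/__init__.py"],
    "my_app/__init__.py": [],
    "my_app/unused.py": [],
}


def subgraph(index: dict[str, list[str]], entry_point: str) -> set[str]:
    """Return every file reachable from entry_point (breadth-first walk)."""
    seen = {entry_point}
    queue = deque([entry_point])
    while queue:
        current = queue.popleft()
        for dep in index.get(current, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


reachable = subgraph(index, "my_app/main.py")
print(sorted(reachable))
```

Files that the entry point never reaches (here `my_app/unused.py`) stay out of the result, which is what makes a per-entry-point subgraph useful for packaging.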


Example Project

/path/to/project/
└── my_app/
    ├── __init__.py
    ├── main.py       # imports utils
    └── utils.py      # has no other local imports

Usage

import py_dependency_mapper
from pprint import pprint

# --- PHASE 1: Indexing (Done once at the start) ---
# This builds a complete map of all files, their hashes, and their imports.
# This is the heavy operation, but it's only done once.
print("Building the project's dependency map...")

dependency_map = py_dependency_mapper.build_dependency_map(
    source_root="/path/to/project",
    project_module_prefixes=["my_app"],
    include_paths=["my_app/"],
    stdlib_list_path="/path/to/stdlib.txt"  # Optional
)
print(f"Map built with {len(dependency_map)} files.")
# Expected output: Map built with 3 files.

# --- PHASE 2: Querying (Done as many times as you need) ---
# Now, for any Lambda or application entry point, you can get
# its specific dependency graph almost instantly.
entry_point = "/path/to/project/my_app/main.py"

print(f"\nGetting dependency graph for: {entry_point}")
# This call is extremely fast because it only queries the in-memory map.
dependency_graph = py_dependency_mapper.get_dependency_graph(
    dependency_map=dependency_map,
    entry_point=entry_point
)

print(f"The entry point requires {len(dependency_graph)} total files.")

# Examine detailed information for each file
for path, file_info in dependency_graph.items():
    print(f"\nFile: {path}")
    print(f"  Hash: {file_info.hash}")
    print(f"  Stdlib imports: {file_info.stdlib_imports}")
    print(f"  Third party imports: {file_info.third_party_imports}")
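
One possible use of the per-file hashes, sketched here as an assumption rather than a documented feature, is to derive a single fingerprint per entry point so that a build or deployment can be skipped when nothing it depends on has changed. The `graph_fingerprint` helper is hypothetical, `SimpleNamespace` objects stand in for the library's result objects, and the sketch assumes `hash` is exposed as a string.

```python
import hashlib
from types import SimpleNamespace


def graph_fingerprint(dependency_graph: dict) -> str:
    """Combine per-file hashes into one digest, independent of dict order."""
    digest = hashlib.sha256()
    for path in sorted(dependency_graph):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        # Assumption: `hash` is a string (for example a hex digest).
        digest.update(dependency_graph[path].hash.encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()


# Stand-in data shaped like the documented result (only `hash` is used here).
graph = {
    "/path/to/project/my_app/main.py": SimpleNamespace(hash="aaa"),
    "/path/to/project/my_app/utils.py": SimpleNamespace(hash="bbb"),
}
before = graph_fingerprint(graph)

# Simulate an edit to utils.py: its content hash changes, so the fingerprint does too.
graph["/path/to/project/my_app/utils.py"].hash = "ccc"
after = graph_fingerprint(graph)
print(before != after)
```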

PIP Package Dependencies Analysis

The library includes advanced capabilities for analyzing installed pip packages:

Dependency Tree Generation

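# Generate the dependency tree using uv (shell command)
uv pip tree > dependencies.txt

# Then convert it to JSON with a custom script (Python).
# parse_pip_tree_to_dict is the conversion function defined in the next section.
import json
from pathlib import Path

tree = parse_pip_tree_to_dict(Path("dependencies.txt").read_text(encoding="utf-8"))
Path("dependencies.json").write_text(json.dumps(tree, indent=2), encoding="utf-8")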

Custom conversion script

import re


def parse_pip_tree_to_dict(tree_output: str) -> dict:
    """
    TODO: https://github.com/astral-sh/uv/issues/4711
    Revisit when this issue is resolved. `uv` will then be able to
    generate JSON output directly, making this parser obsolete.
    """
    dependency_tree = {}
    stack = [(-1, dependency_tree)]

    for line in tree_output.strip().splitlines():
        indentation = len(line) - len(line.lstrip(" │├─└"))

        clean_line = line.lstrip(" │├─└").strip()

        is_duplicate = "(*)" in clean_line
        clean_line = clean_line.replace(" (*)", "").strip()

        match = re.match(r"([\w.-]+)(?:[=v\s]+)([\d.\w+-]+)", clean_line)
        if not match:
            continue

        package_name, version = match.groups()

        while stack[-1][0] >= indentation:
            stack.pop()

        parent_dependencies = stack[-1][1]

        current_package_node = {"version": version, "dependencies": {}}

        parent_dependencies[package_name] = current_package_node

        if not is_duplicate:
            stack.append((indentation, current_package_node["dependencies"]))

    return dependency_tree
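
To make the parser's output concrete, here is a small run on hand-written sample text. The sample is an assumption modeled on the general shape of `uv pip tree` output (the package names and versions are made up for illustration), and the function is repeated in condensed form only so the snippet runs on its own.

```python
import re


def parse_pip_tree_to_dict(tree_output: str) -> dict:
    dependency_tree = {}
    stack = [(-1, dependency_tree)]  # (indentation, dict that receives children)
    for line in tree_output.strip().splitlines():
        indentation = len(line) - len(line.lstrip(" │├─└"))
        clean_line = line.lstrip(" │├─└").strip()
        is_duplicate = "(*)" in clean_line
        clean_line = clean_line.replace(" (*)", "").strip()
        match = re.match(r"([\w.-]+)(?:[=v\s]+)([\d.\w+-]+)", clean_line)
        if not match:
            continue
        package_name, version = match.groups()
        # Climb back up until the top of the stack is this line's parent.
        while stack[-1][0] >= indentation:
            stack.pop()
        node = {"version": version, "dependencies": {}}
        stack[-1][1][package_name] = node
        if not is_duplicate:
            stack.append((indentation, node["dependencies"]))
    return dependency_tree


# Illustrative input only; real output depends on your environment.
SAMPLE = """\
requests v2.31.0
├── certifi v2024.2.2
└── urllib3 v2.2.1
pandas v2.2.0
├── numpy v1.26.4
└── python-dateutil v2.8.2
    └── six v1.16.0
"""

tree = parse_pip_tree_to_dict(SAMPLE)
print(sorted(tree))
print(tree["pandas"]["dependencies"]["python-dateutil"])
```

Top-level packages become keys of the outer dictionary, and each nesting level in the text becomes one more level of `"dependencies"`.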

PIP Metadata Analysis

# Build pip package metadata
pip_metadata = py_dependency_mapper.build_pip_metadata(
    dependency_tree_json_path="dependencies.json",
    site_packages_path="/path/to/site-packages",
    manual_mapping_path="mappings.toml"  # Optional: custom mappings
)

# Explore available information
print("Import to pip package mapping:")
pprint(pip_metadata.import_to_pip_map)

print("\nPackage information:")
for pkg_name, pkg_info in pip_metadata.pip_package_info_map.items():
    print(f"{pkg_name}: v{pkg_info.version}")
    print(f"  Dependencies: {pkg_info.dependencies}")
    print(f"  Installed paths: {pkg_info.installed_paths}")

Package Set Resolution

# Automatically resolve all dependencies for specific packages
resolved_packages = py_dependency_mapper.resolve_package_set(
    direct_packages=["requests", "numpy", "pandas"],
    pip_metadata=pip_metadata
)

print("Resolved packages with all their dependencies:")
for pkg_name, pkg_info in resolved_packages.items():
    print(f"  {pkg_name} v{pkg_info.version}")
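
Conceptually, resolving a package set is a transitive closure over each package's direct dependencies. The sketch below shows that idea on plain dictionaries; `resolve_closure` and the sample data are hypothetical and are not the library's implementation.

```python
def resolve_closure(direct: list[str], deps: dict[str, list[str]]) -> set[str]:
    """Return the direct packages plus everything they depend on, transitively."""
    resolved: set[str] = set()
    stack = list(direct)
    while stack:
        name = stack.pop()
        if name in resolved:
            continue  # already handled; this also guards against dependency cycles
        resolved.add(name)
        stack.extend(deps.get(name, []))
    return resolved


# Hypothetical direct-dependency lists, shaped like PipPackageInfo.dependencies.
deps = {
    "requests": ["certifi", "idna", "urllib3"],
    "pandas": ["numpy", "python-dateutil"],
    "python-dateutil": ["six"],
}
print(sorted(resolve_closure(["requests", "pandas"], deps)))
```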

🔧 Manual Mappings with TOML

For cases where automatic detection is not sufficient, you can use a TOML file for custom mappings:

# mappings.toml

# Map import names to pip package names
[import_mappings]
"cv2" = "opencv-python"
"sklearn" = "scikit-learn" 
"PIL" = "Pillow"
"yaml" = "PyYAML"

# Additional dependencies that should be included
[extra_dependencies]
"fastapi" = ["uvicorn", "python-multipart"]
"pydantic" = ["email-validator"]

# Additional package paths
[extra_package_paths]
"tensorflow" = ["bin", "include", "lib"]
"gremlinpython" = ["bin", "lib"]

:book: API Reference

build_dependency_map(
    source_root: str,
    project_module_prefixes: List[str],
    include_paths: List[str],
    stdlib_list_path: Optional[str] = None
) -> Dict[str, ProjectFile]

Scans the project and builds the dependency map.

source_root: Absolute path to the root of your source code.

project_module_prefixes: A list of module prefixes to include in the analysis (e.g., ["my_app"]).

include_paths: A list of directories or files (relative to source_root) to begin the scan from.

stdlib_list_path: Optional path to a file containing standard library module names.

returns: A dictionary mapping file paths to ProjectFile objects.


get_dependency_graph(
    dependency_map: Dict,
    entry_point: str
) -> Dict[str, GraphFileResult]

From the pre-built map, gets the dependency subgraph for a specific entry point.

dependency_map: The dictionary returned by build_dependency_map.

entry_point: The absolute path to the initial .py file.

returns: A dictionary mapping file paths to GraphFileResult objects.


PIP Package Analysis Functions

build_pip_metadata(
    dependency_tree_json_path: str,
    site_packages_path: str,
    manual_mapping_path: Optional[str] = None
) -> PipMetadata

Builds metadata for installed pip packages from a dependency tree JSON file.

dependency_tree_json_path: Path to JSON file containing the dependency tree.

site_packages_path: Path to the site-packages directory.

manual_mapping_path: Optional path to TOML file with manual mappings.

returns: A PipMetadata object containing package information and mappings.

resolve_package_set(
    direct_packages: List[str],
    pip_metadata: PipMetadata
) -> Dict[str, PipPackageInfo]

Resolves all dependencies for a set of direct packages.

direct_packages: List of package names to resolve dependencies for.

pip_metadata: The PipMetadata object from build_pip_metadata.

returns: A dictionary mapping package names to PipPackageInfo objects.


Data Structures

GraphFileResult

Contains information about a Python source file:

hash: SHA256 hash of the file content.

project_imports: List of imported project modules (file paths).

stdlib_imports: List of imported standard library modules.

third_party_imports: List of imported third-party packages.

PipMetadata

Contains pip package analysis results:

import_to_pip_map: Mapping from import names to pip package names.

pip_package_info_map: Mapping from pip package names to package information.

extra_dependencies_map: Manual additional dependencies from TOML.

extra_paths_map: Manual additional paths from TOML.

PipPackageInfo

Information about a specific pip package:

version: Package version string.

installed_paths: List of installed file/directory paths.

dependencies: List of direct dependency package names.
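
The documented fields can be combined to go from an entry point to the pip packages it needs: collect `third_party_imports` across the graph, then translate them through `import_to_pip_map`. In the sketch below, `FileResult` is a stand-in that mirrors only one documented field, `required_pip_packages` is a hypothetical helper, and splitting on `.` is a defensive assumption in case dotted module names appear.

```python
from dataclasses import dataclass, field


@dataclass
class FileResult:  # stand-in for GraphFileResult
    third_party_imports: list[str] = field(default_factory=list)


def required_pip_packages(
    graph: dict[str, FileResult],
    import_to_pip_map: dict[str, str],
) -> tuple[set[str], set[str]]:
    """Return (pip packages needed, import names with no known mapping)."""
    packages: set[str] = set()
    unmapped: set[str] = set()
    for file_info in graph.values():
        for name in file_info.third_party_imports:
            top_level = name.split(".")[0]  # assumption: dotted names may appear
            if top_level in import_to_pip_map:
                packages.add(import_to_pip_map[top_level])
            else:
                unmapped.add(top_level)
    return packages, unmapped


# Made-up data for illustration.
graph = {
    "main.py": FileResult(["requests", "yaml"]),
    "utils.py": FileResult(["PIL.Image", "mystery_lib"]),
}
mapping = {"requests": "requests", "yaml": "PyYAML", "PIL": "Pillow"}
packages, unmapped = required_pip_packages(graph, mapping)
print(sorted(packages), sorted(unmapped))
```

The resulting package names could then be passed as `direct_packages` to `resolve_package_set`, and any unmapped names are candidates for an entry in the TOML mappings file.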


:scroll: License

This project is licensed under the MIT License.
See the LICENSE file for more details.


:raised_hands: Acknowledgements

This tool would not be possible without the incredible work of the team behind the Ruff project,
whose high-performance parser is the heart of this analyzer.

Ruff's license can be found in licenses/LICENSE-RUFF.md.
