pydistill

Extract Python models and their transitive dependencies into standalone, self-contained packages.

Why?

You have a large Python project and want to share some Pydantic models (or any Python classes) with another service without:

  • Publishing your entire codebase
  • Manually copying files and fixing imports
  • Maintaining a separate package by hand

PyDistill automates this: point it at your entry points, and it extracts the modules they depend on into an importable package.

Installation

# With uv
uv add pydistill

# With pip
pip install pydistill

Quick Start

# Extract a model and its dependencies
pydistill \
    --entry myapp.models:User \
    --entry myapp.models:Order \
    --base-package myapp \
    --output-package extracted_models \
    --output-dir ./dist/extracted_models

This will:

  1. Discover all local imports transitively from User and Order
  2. Copy the relevant source files to ./dist/extracted_models/
  3. Rewrite imports (myapp.* → extracted_models.*)
  4. Generate __init__.py files so the package is importable
  5. Generate pyproject.toml so the output is installable with pip

Install the extracted output directly:

pip install ./dist/extracted_models

Configuration

CLI Options

pydistill [OPTIONS]

Options:
  -e, --entry MODULE:NAME     Entry point (can be repeated)
  -b, --base-package PACKAGE  Base package to extract from
  -p, --output-package PACKAGE Output package name
  -o, --output-dir DIR        Output directory
  -s, --source-root DIR       Additional source roots (can be repeated)
  -c, --config FILE           Path to config file
  -n, --dry-run               Show what would be extracted
  --clean                     Remove output directory first
  -q, --quiet                 Suppress output
  -f, --filesystem-only       Skip importlib, use only filesystem resolution
  --dist-name NAME            Distribution name in generated pyproject.toml
  --dist-version VERSION      Distribution version in generated pyproject.toml
  --dependency SPEC           Dependency specifier (can be repeated)
  --format                    Format extracted files (default: ruff format)
  --formatter CMD             Custom formatter command (implies --format)
  --version                   Show version
  -h, --help                  Show help

Configuration File

Create a pydistill.toml in your project root:

[pydistill]
entries = [
    "myapp.users.models:User",
    "myapp.orders.models:Order",
]
base_package = "myapp"
output_package = "extracted_models"
output_dir = "./dist/extracted_models"
clean = true
dist_name = "extracted-models"
dist_version = "0.1.0"
dependencies = ["pydantic>=2.0", "email-validator>=2.0"]
# filesystem_only = true  # Enable for uninstallable projects
# format = true           # Format extracted files
# formatter = "ruff format"  # Custom formatter command

Then just run:

pydistill

PyDistill automatically searches for pydistill.toml in the current directory and parent directories.

CLI arguments override config file values.

Example

Given this project structure:

myapp/
├── __init__.py
├── common/
│   ├── __init__.py
│   └── types.py          # Status enum, Address model
├── users/
│   ├── __init__.py
│   └── models.py         # User model (imports common.types)
└── orders/
    ├── __init__.py
    └── models.py         # Order model (imports users.models, common.types)

Running:

pydistill -e myapp.orders.models:Order -b myapp -p extracted -o ./dist/extracted

Produces:

dist/extracted/
├── pyproject.toml
├── __init__.py
├── common/
│   ├── __init__.py
│   └── types.py          # Status, Address (imports rewritten)
├── users/
│   ├── __init__.py
│   └── models.py         # User (imports rewritten)
└── orders/
    ├── __init__.py
    └── models.py         # Order (imports rewritten)

All imports like from myapp.common.types import Status become from extracted.common.types import Status.
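As a rough illustration of that rewrite, here is a minimal sketch built on ast.NodeTransformer. It is not PyDistill's implementation: ImportRewriter and rewrite_imports are hypothetical names, and the sketch only renames import statements (it does not touch attribute references such as myapp.common.types.Status in the body of a module).

```python
import ast


class ImportRewriter(ast.NodeTransformer):
    """Rename absolute imports under `old` so they live under `new`."""

    def __init__(self, old: str, new: str) -> None:
        self.old = old
        self.new = new

    def _rename(self, module: str) -> str:
        # Match the package itself or a dotted child, never a mere prefix
        # (so "myapplication" is left alone when old == "myapp").
        if module == self.old:
            return self.new
        if module.startswith(self.old + "."):
            return self.new + module[len(self.old):]
        return module

    def visit_Import(self, node: ast.Import) -> ast.Import:
        for alias in node.names:
            alias.name = self._rename(alias.name)
        return node

    def visit_ImportFrom(self, node: ast.ImportFrom) -> ast.ImportFrom:
        # level > 0 marks a relative import; those are left as-is.
        if node.level == 0 and node.module is not None:
            node.module = self._rename(node.module)
        return node


def rewrite_imports(source: str, old: str, new: str) -> str:
    """Parse, rewrite, and unparse a module (requires Python 3.9+ for ast.unparse)."""
    tree = ImportRewriter(old, new).visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


print(rewrite_imports("from myapp.common.types import Status", "myapp", "extracted"))
```

The explicit check for old + "." is the design point worth noting: a plain string replace would also rewrite unrelated packages whose names merely start with the base package name.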

Extracting from Uninstallable Projects

Need to extract models from a project that can't be installed in your current environment (e.g., has Windows-only dependencies on macOS)? Use --filesystem-only:

pydistill \
    -e their_app.models:SomeModel \
    -b their_app \
    -p extracted \
    -o ./dist/extracted \
    -s /path/to/their/project \
    --filesystem-only

This skips Python's importlib and resolves modules purely via filesystem search in the specified source roots.
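Filesystem resolution of this kind can be approximated as follows. This is a sketch under stated assumptions, not the tool's code: resolve_module is a hypothetical helper, and it assumes a dotted module name maps either to a package directory containing __init__.py or to a single .py file under one of the source roots.

```python
from __future__ import annotations

from pathlib import Path
from typing import Iterable


def resolve_module(module: str, source_roots: Iterable[Path]) -> Path | None:
    """Map a dotted module name to a file without importing anything."""
    parts = module.split(".")
    for root in source_roots:
        base = Path(root).joinpath(*parts)
        # A package directory takes precedence over a same-named module file,
        # mirroring the order Python's own import system uses.
        package_init = base / "__init__.py"
        if package_init.is_file():
            return package_init
        module_file = base.with_suffix(".py")
        if module_file.is_file():
            return module_file
    return None
```

Because nothing is imported, the target project's dependencies never have to be installed; the trade-off is that anything the real import system would resolve by other means (path hooks, .pth files, namespace packages) is invisible to a lookup like this.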

Formatting Extracted Code

The AST-based import rewriting can produce code that's not perfectly formatted. Use --format to automatically format extracted files:

# Use ruff (default)
pydistill -e myapp:Model -b myapp -p out -o ./dist --format

# Use black instead
pydistill ... --format --formatter black

# Custom ruff config
pydistill ... --format --formatter "ruff format --line-length 120"

Formatting failures are non-fatal—extraction will succeed even if the formatter is unavailable.
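One way to get that non-fatal behavior is to treat both "formatter not found" and "formatter exited with an error" as warnings. The sketch below is illustrative only; format_files is a hypothetical name, and it assumes the formatter accepts file paths as trailing arguments, as ruff format and black do.

```python
from __future__ import annotations

import shlex
import subprocess
from pathlib import Path


def format_files(files: list[Path], formatter: str = "ruff format") -> bool:
    """Run a formatter over `files`; return False instead of raising on failure."""
    if not files:
        return True
    # shlex.split lets the command carry its own options,
    # e.g. "ruff format --line-length 120".
    command = [*shlex.split(formatter), *(str(path) for path in files)]
    try:
        result = subprocess.run(command, capture_output=True, text=True, check=False)
    except OSError as exc:  # includes FileNotFoundError when the tool is missing
        print(f"warning: could not run {formatter!r}: {exc}")
        return False
    if result.returncode != 0:
        print(f"warning: {formatter!r} exited with status {result.returncode}")
        return False
    return True
```

The caller can ignore the return value and carry on with the extraction, or log it, which matches the "extraction still succeeds" behavior described above.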

How It Works

  1. Parse entry points - Resolve module:ClassName to source files
  2. Discover imports - Use Python's ast module to find all import and from ... import statements
  3. BFS traversal - Follow imports transitively within the base package (ignores third-party and standard-library modules such as pydantic and datetime)
  4. Rewrite imports - Use ast.NodeTransformer to rewrite module paths
  5. Generate package - Copy files preserving structure, create __init__.py files, and write pyproject.toml
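Steps 2 and 3 can be sketched with the standard library alone. This is a simplified illustration rather than PyDistill's implementation: local_imports and discover are hypothetical names, read_source is an assumed callable that returns the source text for a module name, and relative imports and "from package import submodule" forms are deliberately not handled.

```python
from __future__ import annotations

import ast
from collections import deque
from typing import Callable, Iterable


def local_imports(source: str, base_package: str) -> set[str]:
    """Collect absolute imports in `source` that fall under `base_package`."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            if name == base_package or name.startswith(base_package + "."):
                found.add(name)
    return found


def discover(
    entry_modules: Iterable[str],
    base_package: str,
    read_source: Callable[[str], str],
) -> list[str]:
    """Breadth-first walk over local imports, starting from the entry modules."""
    queue = deque(dict.fromkeys(entry_modules))  # de-duplicate, keep order
    seen = set(queue)
    order: list[str] = []
    while queue:
        module = queue.popleft()
        order.append(module)
        # Sorting makes the traversal order deterministic.
        for dependency in sorted(local_imports(read_source(module), base_package)):
            if dependency not in seen:
                seen.add(dependency)
                queue.append(dependency)
    return order


sources = {
    "myapp.orders.models": (
        "from myapp.users.models import User\n"
        "from myapp.common.types import Status\n"
        "from pydantic import BaseModel\n"
    ),
    "myapp.users.models": "import datetime\nfrom myapp.common.types import Address\n",
    "myapp.common.types": "import enum\n",
}
print(discover(["myapp.orders.models"], "myapp", sources.__getitem__))
# ['myapp.orders.models', 'myapp.common.types', 'myapp.users.models']
```

The seen set is what keeps circular imports from looping forever, and filtering on the base package is what keeps pydantic, datetime, and enum out of the result.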

Use Cases

  • Microservices: Share domain models between services without a monorepo
  • API clients: Generate a lightweight client package with just the request/response models
  • CI/CD: Automatically generate model packages when source changes
  • Cross-platform extraction: Extract from projects with platform-specific dependencies using --filesystem-only

Limitations

  • Only follows imports within --base-package (by design)
  • Extracts entire modules, not individual classes (a module containing User and Admin will include both)
  • Relative imports are preserved as-is (they work in the output package since structure is maintained)
  • Comments are not preserved: The AST-based rewriting round-trips each file through ast.parse() and ast.unparse(), which drops comments. This is acceptable for artifact/tarball use cases, but the extracted code loses its inline documentation. Use --format to keep the output consistently formatted.
  • Dynamic imports are not traced: Imports using importlib.import_module() or __import__() cannot be statically analyzed
  • Star imports (from x import *): The module path is rewritten correctly, but transitive dependencies imported via __all__ re-exports may not be discovered
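The comment limitation is a property of Python's ast module itself and can be reproduced without PyDistill. The snippet below is a standalone demonstration (the User class is made up for the example); it requires Python 3.9+ for ast.unparse.

```python
import ast

source = '''\
class User:
    """A registered user."""  # docstrings are string nodes, so they survive

    # This comment is lost in the round trip.
    name: str = "anonymous"  # so is this one
'''

round_tripped = ast.unparse(ast.parse(source))
print(round_tripped)
```

The printed class keeps its docstring and its field but contains no "#" comments, because comments never make it into the syntax tree. Layout details such as blank lines and quote style may also be normalized, which is where --format helps.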

Development

# Clone and install
git clone https://github.com/yourorg/pydistill
cd pydistill
uv sync

# Run tests
uv run pytest

# Run pydistill locally
uv run pydistill --help

License

MIT


