Skip to main content

Extract comprehensive metadata from Albumentations transforms including parameters, types, constraints, and docstrings

Project description

albu-spec

Extract comprehensive metadata from AlbumentationsX transforms including parameter names, types, constraints, and docstrings.

Features

  • Parameter Extraction: Extract parameter names, types, and default values from __init__ signatures
  • Deep Constraint Analysis: Parse Pydantic Field constraints (ge, le, gt, lt, etc.)
  • Validator Introspection: Extract information from AfterValidator bounds and custom validators
  • Structured Docstring Parsing: Parse Google-style docstrings into structured sections (args, examples, notes, warnings, references, etc.)
  • Complete Metadata: Get transform type, supported targets, and module information
  • BBox Type Support: Extract supported bounding box types (HBB, OBB) for dual transforms
  • Type Safety: All data returned as typed Pydantic models
  • JSON Serializable: Export all metadata as JSON for APIs and databases

Installation

pip install albu-spec

Note: This package requires albumentationsx to be installed separately, as it's designed to introspect an existing AlbumentationsX installation:

pip install albumentationsx

Quick Start

Extract Metadata for a Single Transform

import albumentations as A
from albu_spec import get_transform_metadata

# Get metadata for Affine transform
metadata = get_transform_metadata(A.Affine)

print(f"Transform: {metadata.name}")
print(f"Type: {metadata.transform_type}")
print(f"Module: {metadata.module}")
print(f"Targets: {metadata.targets}")
print(f"Has InitSchema: {metadata.has_init_schema}")
print(f"Supported BBox Types: {metadata.supported_bbox_types}")

Output:

Transform: Affine
Type: dual
Module: albumentations.augmentations.geometric.transforms
Targets: ['image', 'mask', 'bboxes', 'keypoints', 'volume', 'mask3d']
Has InitSchema: True
Supported BBox Types: ['hbb', 'obb']

Full metadata as JSON:

{
  "name": "Affine",
  "module": "albumentations.augmentations.geometric.transforms",
  "transform_type": "dual",
  "targets": ["image", "mask", "bboxes", "keypoints", "volume", "mask3d"],
  "parameters": {
    "scale": {
      "name": "scale",
      "type_hint": "tuple[float, float] | float | dict[str, float | tuple[float, float]]",
      "default": [1.0, 1.0],
      "description": "Scaling factor to use, where ``1.0`` denotes \"no change\" and ``0.5`` is zoomed out to ``50`` percent of the original size...",
      "constraints": null
    },
    "rotate": {
      "name": "rotate",
      "type_hint": "tuple[float, float] | float",
      "default": 0.0,
      "description": "Rotation in degrees (**NOT** radians), i.e. expected value range is around ``[-360, 360]``...",
      "constraints": null
    },
    "interpolation": {
      "name": "interpolation",
      "type_hint": [0, 1, 2, 3, 4],
      "default": 1,
      "description": "OpenCV interpolation flag.",
      "constraints": null
    },
    "p": {
      "name": "p",
      "type_hint": "float",
      "default": 0.5,
      "description": "probability of applying the transform. Default: 0.5.",
      "constraints": {
        "ge": 0.0,
        "le": 1.0,
        "gt": null,
        "lt": null,
        "min_length": null,
        "max_length": null,
        "multiple_of": null,
        "min_value": null,
        "max_value": null,
        "pattern": null,
        "validators": [],
        "validator_info": {}
      }
    }
  },
  "docstring": "Augmentation to apply affine transformations to images...",
  "docstring_short": "Augmentation to apply affine transformations to images.",
  "has_init_schema": true
}

(Note: Some parameters omitted for brevity)

Inspect Individual Parameters

# Check parameter details
for param_name, param_info in metadata.parameters.items():
    print(f"{param_name}:")
    print(f"  Type: {param_info.type_hint}")
    print(f"  Default: {param_info.default}")
    if param_info.constraints:
        print(f"  Constraints: {param_info.constraints}")

Extract All Transforms

from albu_spec import get_all_transforms_metadata

# Get all transforms grouped by type
collection = get_all_transforms_metadata()

print(f"Total transforms: {collection.total_count}")
print(f"Image-only transforms: {len(collection.image_only)}")
print(f"Dual transforms: {len(collection.dual)}")
print(f"3D transforms: {len(collection.transforms_3d)}")

# Iterate through all transforms
for transform in collection.get_all():
    print(f"{transform.name} ({transform.transform_type})")

Check Bounding Box Type Support

from albu_spec import get_transform_metadata
import albumentations as A

# Check which bbox types a transform supports
transforms_to_check = [A.Affine, A.Rotate, A.CenterCrop, A.ColorJitter]

for transform_class in transforms_to_check:
    metadata = get_transform_metadata(transform_class)
    if metadata.supported_bbox_types:
        print(f"{metadata.name}: {metadata.supported_bbox_types}")
    else:
        print(f"{metadata.name}: No bbox support (not a dual transform)")

Output:

Affine: ['hbb', 'obb']
Rotate: ['hbb', 'obb']
CenterCrop: ['hbb']
ColorJitter: No bbox support (not a dual transform)

Detailed Examples

Examining Parameter Constraints

from albu_spec import get_transform_metadata
import albumentations as A

# Get GlassBlur metadata
metadata = get_transform_metadata(A.GlassBlur)

# Check sigma parameter
sigma_param = metadata.parameters['sigma']
print(f"Parameter: {sigma_param.name}")
print(f"Type: {sigma_param.type_hint}")
print(f"Default: {sigma_param.default}")

if sigma_param.constraints:
    print(f"Constraints:")
    if sigma_param.constraints.ge is not None:
        print(f"  >= {sigma_param.constraints.ge}")
    if sigma_param.constraints.le is not None:
        print(f"  <= {sigma_param.constraints.le}")

Accessing Validator Information

from albu_spec import get_transform_metadata
import albumentations as A

# Get MotionBlur metadata
metadata = get_transform_metadata(A.MotionBlur)

# Check angle_range parameter
angle_param = metadata.parameters['angle_range']

if angle_param.constraints and angle_param.constraints.validator_info:
    print("Validator information:")
    for validator_name, validator_data in angle_param.constraints.validator_info.items():
        print(f"  {validator_name}: {validator_data}")

Working with Structured Docstrings

from albu_spec import get_transform_metadata
import albumentations as A

# Get metadata with parsed docstring
metadata = get_transform_metadata(A.Blur)

if metadata.docstring_parsed:
    parsed = metadata.docstring_parsed

    # Short description for preview cards
    print(f"Description: {parsed.short_description}")

    # Parameters with types and descriptions
    print("\nParameters:")
    for arg in parsed.args:
        print(f"  {arg.name} ({arg.type}): {arg.description}")

    # Code examples
    if parsed.examples:
        print(f"\nFound {len(parsed.examples)} example(s)")
        print("First example:")
        print(parsed.examples[0][:200] + "...")

    # Additional sections
    if parsed.notes:
        print(f"\nNotes: {parsed.notes}")

    if parsed.warnings:
        print(f"\nWarnings: {parsed.warnings}")

    if parsed.references:
        print(f"\nReferences: {parsed.references}")

    # Extra sections (Image types, Targets, Mathematical Formulation, etc.)
    if parsed.extra_sections:
        print("\nExtra sections:")
        for section_name, section_content in parsed.extra_sections.items():
            print(f"  {section_name}: {section_content[:100]}...")

Export to JSON

from albu_spec import get_all_transforms_metadata
import json

# Get all transforms
collection = get_all_transforms_metadata()

# Convert to dict and export
data = collection.model_dump()

with open("transforms_metadata.json", "w") as f:
    json.dump(data, f, indent=2)

print("Metadata exported to transforms_metadata.json")

Filter Transforms by Criteria

from albu_spec import get_all_transforms_metadata

collection = get_all_transforms_metadata()

# Find all transforms with InitSchema
transforms_with_schema = [
    t for t in collection.get_all()
    if t.has_init_schema
]

print(f"Transforms with InitSchema: {len(transforms_with_schema)}")

# Find all transforms that support bboxes
transforms_with_bboxes = [
    t for t in collection.get_all()
    if "bboxes" in t.targets
]

print(f"Transforms supporting bboxes: {len(transforms_with_bboxes)}")

# Find transforms that support OBB (oriented bounding boxes)
transforms_with_obb = [
    t for t in collection.dual
    if t.supported_bbox_types and "obb" in t.supported_bbox_types
]

print(f"Transforms supporting OBB: {len(transforms_with_obb)}")
for t in transforms_with_obb[:5]:
    print(f"  - {t.name}: {t.supported_bbox_types}")

Data Models

TransformMetadata

Complete metadata for a transform:

class TransformMetadata(BaseModel):
    name: str  # Transform class name
    module: str  # Module path
    transform_type: Literal["image_only", "dual", "transforms_3d", "unknown"]
    targets: list[str]  # Supported targets
    parameters: dict[str, ParameterMetadata]  # Parameter metadata
    docstring: str | None  # Complete docstring (raw)
    docstring_short: str | None  # Short description
    docstring_parsed: ParsedDocstring | None  # Structured parsed docstring
    has_init_schema: bool  # Whether InitSchema exists
    supported_bbox_types: list[str] | None  # Supported bbox types (hbb, obb) for dual transforms

ParameterMetadata

Metadata for a single parameter:

class ParameterMetadata(BaseModel):
    name: str  # Parameter name
    type_hint: str | list[str]  # Type annotation
    default: Any  # Default value
    description: str | None  # Description from docstring
    constraints: ConstraintInfo | None  # Pydantic constraints

ConstraintInfo

Constraint information from Pydantic:

class ConstraintInfo(BaseModel):
    ge: float | None  # Greater than or equal to
    le: float | None  # Less than or equal to
    gt: float | None  # Greater than
    lt: float | None  # Less than
    min_length: int | None  # Minimum length
    max_length: int | None  # Maximum length
    multiple_of: float | None  # Must be multiple of
    min_value: float | None  # Min value (from validators)
    max_value: float | None  # Max value (from validators)
    pattern: str | None  # Regex pattern
    validators: list[str]  # Validator function names
    validator_info: dict[str, Any]  # Additional validator info

ParsedDocstring

Structured parsed docstring with all sections:

class ParsedDocstring(BaseModel):
    short_description: str | None  # First paragraph
    long_description: str | None  # Extended description
    args: list[DocstringArg]  # Parsed arguments
    returns: DocstringReturn | None  # Return value info
    raises: list[DocstringRaises]  # Exceptions
    yields: DocstringReturn | None  # Yield info (generators)
    examples: list[str]  # Code examples
    notes: str | None  # Additional notes
    warnings: str | None  # User warnings
    see_also: str | None  # Related items
    references: str | None  # Citations/links
    attributes: list[DocstringArg]  # Class attributes
    extra_sections: dict[str, Any]  # All other sections (Image types, Targets, etc.)

Note: extra_sections captures ALL docstring sections not explicitly handled above. AlbumentationsX transforms use 90+ custom section names like "Image types", "Targets", "Mathematical Formulation", "Number of channels", etc. These are automatically captured in extra_sections dict, making the parser future-proof for any new sections.

TransformCollection

Collection of transforms grouped by type:

class TransformCollection(BaseModel):
    image_only: list[TransformMetadata]
    dual: list[TransformMetadata]
    transforms_3d: list[TransformMetadata]
    unknown: list[TransformMetadata]

    @property
    def total_count(self) -> int:
        """Total number of transforms"""

    def get_all(self) -> list[TransformMetadata]:
        """Get all transforms as a flat list"""

Use Cases

Documentation Generation

Generate comprehensive API documentation for Albumentations transforms:

from albu_spec import get_all_transforms_metadata

collection = get_all_transforms_metadata()

for transform in collection.image_only:
    print(f"## {transform.name}\n")

    if transform.docstring_parsed:
        # Use structured docstring
        parsed = transform.docstring_parsed
        print(f"{parsed.short_description}\n")

        print("### Parameters\n")
        for arg in parsed.args:
            print(f"- **{arg.name}** (`{arg.type}`)")
            if arg.description:
                print(f"  - {arg.description}")

        # Include examples if available
        if parsed.examples:
            print("\n### Examples\n")
            for example in parsed.examples:
                print(f"```python\n{example}\n```\n")

        # Include notes if available
        if parsed.notes:
            print(f"\n### Notes\n\n{parsed.notes}\n")

UI Generation

Build dynamic UIs for transform configuration:

from albu_spec import get_transform_metadata
import albumentations as A

metadata = get_transform_metadata(A.Blur)

# Generate UI controls based on parameter types and constraints
for param_name, param in metadata.parameters.items():
    if param.type_hint == "int" and param.constraints:
        # Create slider with min/max from constraints
        min_val = param.constraints.ge or param.constraints.gt or 0
        max_val = param.constraints.le or param.constraints.lt or 100
        print(f"Slider for {param_name}: range({min_val}, {max_val})")
    elif isinstance(param.type_hint, list):
        # Create dropdown for Literal types
        print(f"Dropdown for {param_name}: options={param.type_hint}")

Website/Documentation Backend

Generate structured data for documentation websites:

from albu_spec import get_transform_metadata
import albumentations as A
import json

metadata = get_transform_metadata(A.Blur)

# Create structured data for website rendering
doc_data = {
    "name": metadata.name,
    "type": metadata.transform_type,
    "description": metadata.docstring_parsed.short_description if metadata.docstring_parsed else "",
    "parameters": [],
    "examples": [],
    "notes": None,
}

if metadata.docstring_parsed:
    parsed = metadata.docstring_parsed

    # Parameter table data
    for arg in parsed.args:
        doc_data["parameters"].append({
            "name": arg.name,
            "type": arg.type,
            "description": arg.description,
            "default": metadata.parameters[arg.name].default if arg.name in metadata.parameters else None,
        })

    # Code examples
    doc_data["examples"] = [{"language": "python", "code": ex} for ex in parsed.examples]

    # Notes/warnings
    doc_data["notes"] = parsed.notes

# Export as JSON
print(json.dumps(doc_data, indent=2))

Validation Testing

Test transform initialization with various parameter values:

from albu_spec import get_transform_metadata
import albumentations as A

metadata = get_transform_metadata(A.GlassBlur)

# Test edge cases based on constraints
for param_name, param in metadata.parameters.items():
    if param.constraints:
        print(f"Testing {param_name}:")

        if param.constraints.ge is not None:
            print(f"  Min value: {param.constraints.ge}")
            # Test with min value

        if param.constraints.le is not None:
            print(f"  Max value: {param.constraints.le}")
            # Test with max value

Requirements

  • Python >= 3.10
  • pydantic >= 2.0
  • google-docstring-parser >= 0.0.8
  • typing-extensions >= 4.0
  • albumentationsx (installed separately, imports as albumentations)

Contributing

Contributions are welcome! Before submitting your first contribution, please:

  1. Read our Contributing Guide
  2. Sign the Contributor License Agreement (CLA)

For questions, open an issue or email vladimir@albumentations.ai

License

Dual License:

  • AGPL-3.0 for open source use
  • Commercial License for proprietary/commercial applications

For licensing questions, contact: vladimir@albumentations.ai

Related Projects

Credits

Developed by Vladimir Iglovikov and the Albumentations team.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

albu_spec-0.0.4.tar.gz (40.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

albu_spec-0.0.4-py3-none-any.whl (38.8 kB view details)

Uploaded Python 3

File details

Details for the file albu_spec-0.0.4.tar.gz.

File metadata

  • Download URL: albu_spec-0.0.4.tar.gz
  • Upload date:
  • Size: 40.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for albu_spec-0.0.4.tar.gz
Algorithm Hash digest
SHA256 1e15dd2806f55dc01310a72de7cfa373b0f2854547bdbc48cfcb57d42ee959d4
MD5 575aadbc39872a58c4df8bbc77329f72
BLAKE2b-256 ee2adad6fd1505e95c5c799da246b042ebbc2f16f3879cc3abcc4f93fa3e6501

See more details on using hashes here.

File details

Details for the file albu_spec-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: albu_spec-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 38.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for albu_spec-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 826a0c611f50cefaa04a5c4326d69f0b90e0c5bf8d44939491d852f390253381
MD5 520552b3a05a9edf04dc79f659187e03
BLAKE2b-256 509f457c27742c6caeda861b7d2a3378f90cfe9c6d8c7e456d5e607cbcdd3728

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page