
Resolution Aware Transformer

A PyTorch implementation of a resolution-aware transformer designed for multi-scale image analysis, particularly suited for microscopy and medical imaging applications.

Overview

The Resolution Aware Transformer is a PyTorch neural network module that processes multi-scale images to produce embeddings informed by both global context and local details. It combines:

  • Spatial Grouping Attention (SGA): Groups spatially related features for efficient attention computation
  • Rotary Spatial Embeddings (RoSE): Provides spatial awareness through rotary position encoding
  • Multi-resolution Processing: Handles image pyramids with different resolutions seamlessly

This architecture is particularly effective for:

  • Microscopy imaging: Processing high-resolution biological images with multiple scales
  • Medical imaging: Analyzing medical scans with varying levels of detail
  • Computer vision tasks: Any application requiring both global context and fine-grained spatial details

Features

  • Multi-scale Processing: Handles single images or image pyramids with different resolutions
  • Spatial Awareness: Uses Rotary Spatial Embeddings (RoSE) for position-aware attention
  • Flexible Attention: Choice between dense and sparse spatial grouping attention
  • Real-world Coordinates: Supports physical pixel spacing for medical/scientific imaging
  • Masking Support: Handle irregular regions or missing data with optional masks
  • 2D and 3D Support: Works with both 2D images and 3D volumes
  • GPU Optimized: Built on PyTorch with efficient attention mechanisms

Key Parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| `spatial_dims` | Number of spatial dimensions (2 or 3) | `2` |
| `input_features` | Number of input channels | `3` |
| `feature_dims` | Embedding dimension | `128` |
| `num_blocks` | Number of transformer blocks | `4` |
| `sga_attention_type` | `"dense"` or `"sparse"` attention | `"dense"` |
| `num_heads` | Number of attention heads | `16` |
| `kernel_size` | Convolution kernel size for downsampling | `7` |
| `mlp_ratio` | MLP hidden-dimension ratio | `4` |
| `learnable_rose` | Use learnable rotary embeddings | `True` |
| `rose_initial_scaling` | Initial scaling transformation mode: `"log"` (RoSE default), `"rope"` (standard RoPE), `"identity"`/`"linear"`/`"power"` (other variants), or `None` | `"log"` |
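When choosing these values, `feature_dims` should typically divide evenly by `num_heads` so each attention head receives an integer-sized channel slice (a standard multi-head attention constraint). The helper below is an illustrative sanity check, not part of this package's API:

```python
def check_config(feature_dims: int, num_heads: int, spatial_dims: int) -> int:
    """Validate a hypothetical configuration and return the per-head dimension."""
    if spatial_dims not in (2, 3):
        raise ValueError("spatial_dims must be 2 or 3")
    if feature_dims % num_heads != 0:
        raise ValueError(
            f"feature_dims ({feature_dims}) must be divisible by num_heads ({num_heads})"
        )
    return feature_dims // num_heads

# With the documented defaults, each head attends over 128 / 16 = 8 channels.
head_dim = check_config(feature_dims=128, num_heads=16, spatial_dims=2)
print(head_dim)  # 8
```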

Installation

From PyPI

pip install resolution-aware-transformer

From source

pip install git+https://github.com/rhoadesScholar/resolution-aware-transformer.git

Requirements

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Development Setup

# Clone the repository
git clone https://github.com/rhoadesScholar/resolution-aware-transformer.git
cd resolution-aware-transformer

# Install in development mode with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
flake8 .
black .
isort .

Usage

Basic Usage

import torch
from resolution_aware_transformer import ResolutionAwareTransformer

# Initialize the model
model = ResolutionAwareTransformer(
    spatial_dims=2,          # 2D images (use 3 for 3D volumes)
    input_features=3,        # RGB images (adjust for your data)
    feature_dims=128,        # Embedding dimension
    num_blocks=4,            # Number of transformer blocks
    num_heads=16,            # Attention heads
    sga_attention_type="dense"   # "dense" or "sparse"
)

# Single image input
image = torch.randn(1, 3, 256, 256)  # [batch, channels, height, width]
output = model(image)

# Multi-scale image pyramid input
image_pyramid = [
    torch.randn(1, 3, 256, 256),  # High resolution
    torch.randn(1, 3, 128, 128),  # Medium resolution
    torch.randn(1, 3, 64, 64)     # Low resolution
]
output = model(image_pyramid)
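Rather than allocating each scale by hand, a pyramid like the one above can be built by repeated downsampling. This is a minimal sketch using `torch.nn.functional.interpolate`; `build_pyramid` is an illustrative helper, not shipped with this package:

```python
import torch
import torch.nn.functional as F

def build_pyramid(image: torch.Tensor, num_scales: int = 3) -> list:
    """Downsample `image` by a factor of 2 per scale, highest resolution first."""
    pyramid = [image]
    for _ in range(num_scales - 1):
        pyramid.append(
            F.interpolate(
                pyramid[-1], scale_factor=0.5, mode="bilinear", align_corners=False
            )
        )
    return pyramid

pyramid = build_pyramid(torch.randn(1, 3, 256, 256))
print([tuple(t.shape[-2:]) for t in pyramid])  # [(256, 256), (128, 128), (64, 64)]
```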

Advanced Usage with Spacing and Masks

# For medical/microscopy images with known pixel spacing
spacing = [0.5, 0.5]  # μm per pixel in x, y dimensions

# Optional masks for irregular regions
mask = torch.ones(1, 256, 256)  # Valid regions = 1, invalid = 0

output = model(
    image,
    input_spacing=spacing,
    mask=mask
)

# Each output contains embeddings and attention maps
for scale_output in output:
    embeddings = scale_output['x_out']        # [batch, num_patches, feature_dims]
    spacing_info = scale_output['out_spacing'] # Pixel spacing for this scale
    grid_shape = scale_output['out_grid_shape'] # Spatial dimensions
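To turn per-patch embeddings like `x_out` into a single image-level vector, a common downstream step is masked mean pooling over valid patches. This sketch operates on dummy tensors with the `[batch, num_patches, feature_dims]` shape documented above; `masked_mean_pool` is an assumption for illustration, not part of this package's API:

```python
import torch

def masked_mean_pool(embeddings: torch.Tensor, patch_mask: torch.Tensor) -> torch.Tensor:
    """Average [batch, num_patches, feature_dims] embeddings over valid patches only."""
    weights = patch_mask.unsqueeze(-1).float()   # [batch, num_patches, 1]
    total = (embeddings * weights).sum(dim=1)    # [batch, feature_dims]
    count = weights.sum(dim=1).clamp_min(1.0)    # avoid division by zero
    return total / count

embeddings = torch.randn(1, 64, 128)   # e.g. x_out at one scale
patch_mask = torch.ones(1, 64)         # all patches valid
pooled = masked_mean_pool(embeddings, patch_mask)
print(pooled.shape)  # torch.Size([1, 128])
```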

3D Volume Processing

# For 3D medical volumes or microscopy stacks
model_3d = ResolutionAwareTransformer(
    spatial_dims=3,
    input_features=1,        # Grayscale volumes
    feature_dims=256,
    num_blocks=6
)

volume = torch.randn(1, 1, 64, 64, 64)  # [batch, channels, depth, height, width]
output = model_3d(volume)

License

BSD 3-Clause License. See LICENSE for details.

Citation

If you use this software in your research, please cite it using the information in CITATION.cff or use the following BibTeX:

@software{rhoades_resolution_aware_transformer,
  author = {Rhoades, Jeff},
  title = {Resolution Aware Transformer: A PyTorch implementation of a resolution-aware transformer for multi-scale image analysis},
  url = {https://github.com/rhoadesScholar/resolution-aware-transformer},
  version = {2025.8.19.420},
  year = {2025}
}

Acknowledgments

This implementation builds upon research in spatial attention mechanisms and transformer architectures for computer vision. Special thanks to the PyTorch community and contributors to the spatial-grouping-attention and rotary-spatial-embeddings packages.
