# Resolution Aware Transformer
A PyTorch implementation of a resolution-aware transformer designed for multi-scale image analysis, particularly suited for microscopy and medical imaging applications.
## Overview
The Resolution Aware Transformer is a PyTorch neural network module that processes multi-scale images to produce embeddings informed by both global context and local details. It combines:
- **Spatial Grouping Attention (SGA)**: Groups spatially related features for efficient attention computation
- **Rotary Spatial Embeddings (RoSE)**: Provides spatial awareness through rotary position encoding
- **Multi-resolution Processing**: Handles image pyramids with different resolutions seamlessly
This architecture is particularly effective for:
- **Microscopy imaging**: Processing high-resolution biological images with multiple scales
- **Medical imaging**: Analyzing medical scans with varying levels of detail
- **Computer vision tasks**: Any application requiring both global context and fine-grained spatial detail
## Features
- **Multi-scale Processing**: Handles single images or image pyramids with different resolutions
- **Spatial Awareness**: Uses Rotary Spatial Embeddings (RoSE) for position-aware attention
- **Flexible Attention**: Choice between dense and sparse spatial grouping attention
- **Real-world Coordinates**: Supports physical pixel spacing for medical/scientific imaging
- **Masking Support**: Handle irregular regions or missing data with optional masks
- **2D and 3D Support**: Works with both 2D images and 3D volumes
- **GPU Optimized**: Built on PyTorch with efficient attention mechanisms
## Key Parameters

| Parameter | Description | Default |
|---|---|---|
| `spatial_dims` | Number of spatial dimensions (2 or 3) | `2` |
| `input_features` | Number of input channels | `3` |
| `feature_dims` | Embedding dimension | `128` |
| `num_blocks` | Number of transformer blocks | `4` |
| `sga_attention_type` | `"dense"` or `"sparse"` attention | `"dense"` |
| `num_heads` | Number of attention heads | `16` |
| `kernel_size` | Convolution kernel size for downsampling | `7` |
| `mlp_ratio` | MLP hidden dimension ratio | `4` |
| `learnable_rose` | Use learnable rotary embeddings | `True` |
| `rose_initial_scaling` | Initial scaling transformation mode: `"log"` (RoSE default), `"rope"` (standard RoPE), `"identity"`/`"linear"`/`"power"` (other variants), or `None` | `"log"` |
## Installation

### From PyPI

```bash
pip install resolution-aware-transformer
```

### From source

```bash
pip install git+https://github.com/rhoadesScholar/resolution-aware-transformer.git
```
## Requirements
- Python ≥ 3.10
- PyTorch
- spatial-grouping-attention
- rotary-spatial-embeddings
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
### Development Setup

```bash
# Clone the repository
git clone https://github.com/rhoadesScholar/resolution-aware-transformer.git
cd resolution-aware-transformer

# Install in development mode with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
flake8 .
black .
isort .
```
## Usage

### Basic Usage

```python
import torch
from resolution_aware_transformer import ResolutionAwareTransformer

# Initialize the model
model = ResolutionAwareTransformer(
    spatial_dims=2,              # 2D images (use 3 for 3D volumes)
    input_features=3,            # RGB images (adjust for your data)
    feature_dims=128,            # Embedding dimension
    num_blocks=4,                # Number of transformer blocks
    num_heads=16,                # Attention heads
    sga_attention_type="dense",  # "dense" or "sparse"
)

# Single image input
image = torch.randn(1, 3, 256, 256)  # [batch, channels, height, width]
output = model(image)

# Multi-scale image pyramid input
image_pyramid = [
    torch.randn(1, 3, 256, 256),  # High resolution
    torch.randn(1, 3, 128, 128),  # Medium resolution
    torch.randn(1, 3, 64, 64),    # Low resolution
]
output = model(image_pyramid)
```
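If you start from a single high-resolution image, a pyramid like the one above can be built by repeated 2x downsampling. The sketch below uses plain `torch.nn.functional.interpolate` and is a preprocessing suggestion, not part of this library's API:

```python
import torch
import torch.nn.functional as F

def build_pyramid(image: torch.Tensor, num_levels: int = 3) -> list[torch.Tensor]:
    """Downsample a [batch, channels, H, W] image by factors of 2."""
    pyramid = [image]
    for _ in range(num_levels - 1):
        pyramid.append(
            F.interpolate(pyramid[-1], scale_factor=0.5,
                          mode="bilinear", align_corners=False)
        )
    return pyramid

levels = build_pyramid(torch.randn(1, 3, 256, 256))
print([tuple(t.shape) for t in levels])
# [(1, 3, 256, 256), (1, 3, 128, 128), (1, 3, 64, 64)]
```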
### Advanced Usage with Spacing and Masks

```python
# For medical/microscopy images with known pixel spacing
spacing = [0.5, 0.5]  # μm per pixel in x, y dimensions

# Optional mask for irregular regions
mask = torch.ones(1, 256, 256)  # Valid regions = 1, invalid = 0

output = model(
    image,
    input_spacing=spacing,
    mask=mask,
)

# Each output contains embeddings and attention maps
for scale_output in output:
    embeddings = scale_output["x_out"]           # [batch, num_patches, feature_dims]
    spacing_info = scale_output["out_spacing"]   # Pixel spacing for this scale
    grid_shape = scale_output["out_grid_shape"]  # Spatial dimensions
```
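A common downstream step is to collapse the per-patch embeddings into one vector per image, e.g. for a classification head. The mean-pooling below operates on any `[batch, num_patches, feature_dims]` tensor; the random tensor is a stand-in for `scale_output["x_out"]`, so this runs without the model:

```python
import torch

# Stand-in for embeddings = scale_output["x_out"]
embeddings = torch.randn(1, 1024, 128)  # [batch, num_patches, feature_dims]

# Mean-pool over the patch dimension to get one vector per image
pooled = embeddings.mean(dim=1)
print(tuple(pooled.shape))  # (1, 128)
```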
### 3D Volume Processing

```python
# For 3D medical volumes or microscopy stacks
model_3d = ResolutionAwareTransformer(
    spatial_dims=3,
    input_features=1,  # Grayscale volumes
    feature_dims=256,
    num_blocks=6,
)

volume = torch.randn(1, 1, 64, 64, 64)  # [batch, channels, depth, height, width]
output = model_3d(volume)
```
## License
BSD 3-Clause License. See LICENSE for details.
## Citation
If you use this software in your research, please cite it using the information in CITATION.cff or use the following BibTeX:
```bibtex
@software{rhoades_resolution_aware_transformer,
  author  = {Rhoades, Jeff},
  title   = {Resolution Aware Transformer: A PyTorch implementation of a resolution-aware transformer for multi-scale image analysis},
  url     = {https://github.com/rhoadesScholar/resolution-aware-transformer},
  version = {2025.8.19.420},
  year    = {2025}
}
```
## Acknowledgments
This implementation builds upon research in spatial attention mechanisms and transformer architectures for computer vision. Special thanks to the PyTorch community and contributors to the spatial-grouping-attention and rotary-spatial-embeddings packages.