Tool for creating patches from geo-referenced and non-geo-referenced image and label pairs
GeoTIFF Tiler
A Python package for creating training patches from geospatial imagery and label pairs for machine learning applications.
Overview
GeoTIFF Tiler is designed to streamline the creation of training data patches from geo-referenced and non-geo-referenced image and label pairs. It helps prepare data for machine learning models requiring consistent input dimensions, particularly for geospatial applications.
Features
- Create patches of specified size from image-label pairs
- Support for various input formats:
  - Images: GeoTIFFs (geo-referenced and non-geo-referenced), STAC imagery
  - Labels: GeoTIFFs (geo-referenced and non-geo-referenced), vector data (.geojson, .gpkg, .shp)
- Intelligent patch filtering based on label content
- Padding for edge patches to maintain consistent dimensions
- Automatic handling of CRS and alignment issues
- Output in Zarr format for efficient storage and access
- Visualization tools for quality assessment
Installation
pip install geotiff-tiler
Quick Start
from geotiff_tiler.tiler import Tiler

# Define your image-label pairs with metadata
data = [{
    "image": "./path/to/image.tif",
    "label": "./path/to/label.tif",
    "metadata": {"collection": "satellite-name", "gsd": 0.5}
}]

# Initialize the tiler with your configuration
tiler = Tiler(
    input_dict=data,
    patch_size=(256, 256),             # Height, Width
    attr_field=["class", "category"],  # Field(s) in vector data to use for labels
    attr_values=[1, 2, 3],             # Values to extract from the fields
    stride=128,                        # Spacing between patches (128 with 256-pixel patches gives 50% overlap)
    discard_empty=True,                # Skip patches with no labels
    label_threshold=0.05,              # Minimum non-zero label coverage
    output_dir='./output/patches'
)

# Create the patches
tiler.create_tiles()
Using STAC Items
The library supports STAC (SpatioTemporal Asset Catalog) items, making it compatible with cloud-native geospatial workflows.
Parameters
- input_dict: List of dictionaries with "image", "label", and "metadata" keys
- patch_size: Tuple of (height, width) for the output patches
- attr_field: Field name(s) in vector data to use for labeling (list of strings)
- attr_values: Values to extract from the attribute field (list of strings or numbers)
- stride: Spacing between patches (determines overlap); if None, uses max(patch_size)
- discard_empty: Whether to skip patches with no labels
- label_threshold: Minimum fraction of non-zero pixels required in a label patch
- output_dir: Directory to save the output patches
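How patch_size, stride, edge padding, and label_threshold interact can be sketched in plain Python. This is an illustration only, not GeoTIFF Tiler's implementation: the helper names (patch_origins, edge_padding, keep_patch) are hypothetical, and the sketch assumes a regular grid starting at the top-left corner, that patches running past the image edge are kept and padded, and that discard_empty is checked before label_threshold.

```python
import math


def patch_origins(image_hw, patch_size, stride=None):
    """Return (row, col) top-left corners of a regular patch grid.

    Assumption: the grid starts at (0, 0) and the last row/column of
    patches may extend past the image edge (they would need padding).
    """
    height, width = image_hw
    patch_h, patch_w = patch_size
    if stride is None:
        stride = max(patch_size)  # default described in the parameter list
    n_rows = max(1, math.ceil((height - patch_h) / stride) + 1)
    n_cols = max(1, math.ceil((width - patch_w) / stride) + 1)
    return [(r * stride, c * stride) for r in range(n_rows) for c in range(n_cols)]


def edge_padding(origin, image_hw, patch_size):
    """Rows and columns of padding a patch needs to reach full size."""
    pad_rows = max(0, origin[0] + patch_size[0] - image_hw[0])
    pad_cols = max(0, origin[1] + patch_size[1] - image_hw[1])
    return pad_rows, pad_cols


def label_coverage(label_patch):
    """Fraction of non-zero pixels in a 2-D label patch (list of rows)."""
    total = sum(len(row) for row in label_patch)
    nonzero = sum(1 for row in label_patch for value in row if value != 0)
    return nonzero / total if total else 0.0


def keep_patch(label_patch, label_threshold=0.05, discard_empty=True):
    """Decide whether a patch passes the (assumed) filtering rules."""
    coverage = label_coverage(label_patch)
    if discard_empty and coverage == 0:
        return False
    return coverage >= label_threshold


# A 600 x 600 image tiled into 256 x 256 patches with a stride of 128.
origins = patch_origins((600, 600), (256, 256), stride=128)
print(len(origins))                                         # 16 patches (4 x 4 grid)
print(edge_padding(origins[-1], (600, 600), (256, 256)))    # (40, 40)
```

With stride equal to the patch size there is no overlap; a smaller stride yields more patches from the same image at the cost of redundancy between neighbours.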
Output Format
Patches are saved in Zarr format with the following structure:
- images: Array of image patches [N, C, H, W]
- labels: Array of label patches [N, H, W]
- positions: Array of patch locations [N, 2]
- metadata: Dictionary with additional information
A CSV file listing the paths of the Zarr outputs is also created.
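To make the array layout concrete, the following sketch builds synthetic NumPy arrays with the shapes listed above and checks that they agree with each other. It does not read a real Zarr store. The helper names (check_layout, to_channels_last) are hypothetical, and treating each positions entry as a (row, col) top-left corner is an assumption, since the layout description only gives the shape [N, 2].

```python
import numpy as np

# Synthetic stand-ins for the stored arrays (sizes chosen for illustration).
n, c, h, w = 4, 3, 256, 256
images = np.zeros((n, c, h, w), dtype=np.uint8)                  # [N, C, H, W]
labels = np.zeros((n, h, w), dtype=np.uint8)                     # [N, H, W]
positions = np.array([[0, 0], [0, 128], [128, 0], [128, 128]])   # [N, 2], assumed (row, col)


def check_layout(images, labels, positions):
    """Return True if the three arrays agree on N, H, and W."""
    same_count = images.shape[0] == labels.shape[0] == positions.shape[0]
    same_size = images.shape[2:] == labels.shape[1:]
    return bool(same_count and same_size and positions.shape[1] == 2)


def to_channels_last(image_patch):
    """Convert one [C, H, W] patch to [H, W, C], e.g. for plotting."""
    return np.moveaxis(image_patch, 0, -1)


layout_ok = check_layout(images, labels, positions)
first_patch = to_channels_last(images[0])
```

The channels-first layout [N, C, H, W] matches what many deep learning frameworks expect as input, while most plotting tools expect channels last, hence the conversion helper.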
License
MIT License
Author
Victor Alhassan (victor.alhassan@nrcan-rncan.gc.ca)
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.