🛰️ Process raster data in python

Article DOI: 10.1038/s41598-023-47595-7

georeader is a Python package for processing raster data from different satellite missions. It provides a unified interface for reading, manipulating, and saving geospatial raster data with a focus on machine learning workflows.

georeader is mainly used to process satellite data for scientific use, to create ML-ready datasets, and to implement end-to-end operational inference pipelines (e.g. the Kherson Dam Break floodmap). See georeader concepts and protocols for the basic concepts and API.

Install

Requirements: Python ≥3.11

pip install georeader-spaceml

Optional dependencies for specific readers:

# For cloud storage access (GCS, S3, Azure)
pip install georeader-spaceml fsspec gcsfs s3fs adlfs

# For hyperspectral sensors (EMIT, PRISMA, EnMAP)
pip install georeader-spaceml h5py xarray h5netcdf

# For Google Earth Engine integration
pip install georeader-spaceml earthengine-api
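
To see which of the optional packages above are already present in an environment, a small standard-library check along these lines may help. This is only a sketch: the group labels and the helper name `missing_optional_modules` are made up for illustration and are not part of georeader. The one non-obvious name is `ee`, which is the import name of the `earthengine-api` distribution.

```python
from importlib.util import find_spec

# Import names of the optional packages listed above, grouped by purpose.
# The group labels are illustrative only.
OPTIONAL_MODULES = {
    "cloud storage": ["fsspec", "gcsfs", "s3fs", "adlfs"],
    "hyperspectral": ["h5py", "xarray", "h5netcdf"],
    "earth engine": ["ee"],  # earthengine-api is imported as `ee`
}


def missing_optional_modules():
    """Return {group: [module names that cannot be found]} without importing them."""
    return {
        group: [name for name in names if find_spec(name) is None]
        for group, names in OPTIONAL_MODULES.items()
    }


for group, names in missing_optional_modules().items():
    status = "ok" if not names else "missing: " + ", ".join(names)
    print(f"{group}: {status}")
```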

Quick Start

Read a Sentinel-2 image from cloud storage

Read a fixed-size subimage from a Sentinel-2 image, centered on a specific (lon, lat) location:

from georeader.rasterio_reader import RasterioReader
from georeader import read

# S2 image from WorldFloodsv2 dataset
s2url = "https://huggingface.co/datasets/isp-uv-es/WorldFloodsv2/resolve/main/test/S2/EMSR264_18MIANDRIVAZODETAIL_DEL_v2.tif"
rst = RasterioReader(s2url)

# lazy loading bands
rst_rgb = rst.isel({"band": [3, 2, 1]}) # 1-based list as in rasterio

cords_read = (45.43, -19.53) # long, lat
crs_cords = "EPSG:4326"

# See also read.read_from_bounds, read.read_from_polygon for different ways of cropping an image
data = read.read_from_center_coords(rst_rgb,
                                    cords_read, shape=(504, 1040),
                                    crs_center_coords=crs_cords)

data_memory = data.load() # this loads the data to memory

data_memory # GeoTensor object
>>  Transform: | 10.00, 0.00, 539910.00|
               | 0.00,-10.00, 7842990.00|
               | 0.00, 0.00, 1.00|
         Shape: (3, 504, 1040)
         Resolution: (10.0, 10.0)
         Bounds: (539910.0, 7837950.0, 550310.0, 7842990.0)
         CRS: EPSG:32738
         fill_value_default: 0

from georeader import plot
plot.show((data_memory / 3_500).clip(0, 1))
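
The Transform, Shape and Bounds fields of the GeoTensor printed above are tied together by simple affine arithmetic. The following is a minimal sketch in plain Python that recomputes the bounds from the transform coefficients and the array shape. It makes no georeader calls, the helper name `bounds_from_transform` is hypothetical, and it assumes a north-up raster with no rotation.

```python
def bounds_from_transform(transform, shape):
    """Return (xmin, ymin, xmax, ymax) for an unrotated raster.

    transform: (a, b, c, d, e, f) affine coefficients, so that
               x = a*col + b*row + c and y = d*col + e*row + f.
    shape:     (height, width) of the raster in pixels.
    """
    a, b, c, d, e, f = transform
    if b != 0 or d != 0:
        raise ValueError("this sketch only handles unrotated rasters")
    height, width = shape
    x0, x1 = c, c + a * width    # left and right edges
    y0, y1 = f, f + e * height   # top and bottom edges (e < 0 when north-up)
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


# Coefficients and spatial shape copied from the GeoTensor printed above
transform = (10.0, 0.0, 539910.0, 0.0, -10.0, 7842990.0)
bounds = bounds_from_transform(transform, (504, 1040))
print(bounds)  # (539910.0, 7837950.0, 550310.0, 7842990.0)
```

The result matches the Bounds line of the printed GeoTensor: 1040 columns of 10 m span 10400 m in x, and 504 rows span 5040 m in y.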

Saving the GeoTensor as a COG GeoTIFF:

from georeader.save import save_cog

# Supports writing in remote location (e.g. gs://bucket-name/s2_crop.tif)
save_cog(data_memory, "s2_crop.tif", descriptions=["B4","B3", "B2"])

Align images from different sensors

from georeader import read

# Load two images from different sensors
s2_data = read.read_from_tif("sentinel2.tif")
aviris_data = read.read_from_tif("aviris.tif")

# Reproject AVIRIS to match Sentinel-2 grid
aviris_aligned = read.read_reproject(
    aviris_data, 
    dst_crs=s2_data.crs,
    dst_transform=s2_data.transform,
    dst_shape=s2_data.shape[-2:]
)
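
To give an intuition for what aligning one raster to another grid involves, here is a minimal nearest-neighbour resampling sketch in plain Python. It is not georeader's implementation: it assumes both grids already share the same CRS and are unrotated, it works on nested lists rather than arrays, and the function name `resample_nearest` is hypothetical. A real reprojection also has to transform coordinates between CRSs.

```python
import math


def resample_nearest(src, src_transform, dst_transform, dst_shape, fill_value=0):
    """Nearest-neighbour resampling of a 2D grid onto another grid (same CRS).

    Transforms are (a, b, c, d, e, f) affine coefficients with b == d == 0.
    """
    sa, sb, sc, sd, se, sf = src_transform
    da, db, dc, dd, de, df = dst_transform
    if sb or sd or db or dd:
        raise ValueError("this sketch only handles unrotated grids")
    src_h, src_w = len(src), len(src[0])
    dst_h, dst_w = dst_shape
    out = [[fill_value] * dst_w for _ in range(dst_h)]
    for row in range(dst_h):
        y = df + de * (row + 0.5)             # y of the destination pixel centre
        src_row = math.floor((y - sf) / se)   # source row containing that point
        if not 0 <= src_row < src_h:
            continue                          # outside the source: keep fill_value
        for col in range(dst_w):
            x = dc + da * (col + 0.5)         # x of the destination pixel centre
            src_col = math.floor((x - sc) / sa)
            if 0 <= src_col < src_w:
                out[row][col] = src[src_row][src_col]
    return out


# A 2x2 source with 20 m pixels resampled onto a 4x4 grid with 10 m pixels
coarse = [[1, 2],
          [3, 4]]
fine = resample_nearest(
    coarse,
    src_transform=(20.0, 0.0, 0.0, 0.0, -20.0, 40.0),
    dst_transform=(10.0, 0.0, 0.0, 0.0, -10.0, 40.0),
    dst_shape=(4, 4),
)
print(fine)  # [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```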

Core Concepts

GeoTensor

The central data structure is GeoTensor, a NumPy array with geospatial metadata:

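import numpy as np
from affine import Affine
from georeader.geotensor import GeoTensor

# Illustrative inputs: a 3-band 256x256 array on a 10 m north-up grid
np_array = np.zeros((3, 256, 256), dtype=np.float32)
affine_transform = Affine(10.0, 0.0, 500000.0, 0.0, -10.0, 4000000.0)

gt = GeoTensor(
    values=np_array,            # Shape: (C, H, W) or (H, W)
    transform=affine_transform, # Maps pixel to geographic coordinates
    crs="EPSG:32613"            # Coordinate Reference System
)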

# Access properties
gt.bounds      # (xmin, ymin, xmax, ymax)
gt.res         # (x_res, y_res)
gt.footprint() # Shapely polygon of extent
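
The transform is what links array indices to map coordinates. As a rough illustration, the sketch below applies a six-coefficient affine transform and its inverse using plain tuples. The function names and the example coefficients are hypothetical, and no georeader API is used. It follows the usual convention in which (col, row) = (0, 0) is the upper-left corner of the upper-left pixel, so pixel centres sit at half-integer offsets.

```python
def pixel_to_coords(transform, col, row):
    """Map a (col, row) pixel position to (x, y) in the raster's CRS."""
    a, b, c, d, e, f = transform
    return (a * col + b * row + c, d * col + e * row + f)


def coords_to_pixel(transform, x, y):
    """Inverse mapping: (x, y) in the raster's CRS to fractional (col, row)."""
    a, b, c, d, e, f = transform
    det = a * e - b * d
    if det == 0:
        raise ValueError("transform is not invertible")
    col = (e * (x - c) - b * (y - f)) / det
    row = (-d * (x - c) + a * (y - f)) / det
    return (col, row)


# Illustrative 10 m north-up grid with its upper-left corner at (500000, 4000000)
transform = (10.0, 0.0, 500000.0, 0.0, -10.0, 4000000.0)

print(pixel_to_coords(transform, 0, 0))               # (500000.0, 4000000.0)
print(pixel_to_coords(transform, 2.5, 4))             # (500025.0, 3999960.0)
print(coords_to_pixel(transform, 500025.0, 3999960.0))  # (2.5, 4.0)
```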

Reader Protocol

All readers implement the GeoData protocol, providing a consistent interface:

# Any reader works with the same read functions
from georeader import read

data = read.read_from_bounds(reader, bounds, crs_bounds="EPSG:4326")
data = read.read_from_polygon(reader, polygon)
data = read.read_from_center_coords(reader, coords, shape=(512, 512))
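
The idea of a shared protocol can be sketched with `typing.Protocol`: any object exposing the expected attributes can be passed to the same function, with no common base class. The example below is a simplified stand-in, not georeader's actual `GeoData` definition. The names `GeoDataLike`, `InMemoryRaster` and `describe`, and the choice of attributes, are assumptions made for illustration.

```python
from typing import Protocol, Tuple, runtime_checkable


@runtime_checkable
class GeoDataLike(Protocol):
    """Hypothetical stand-in for a reader protocol (not georeader's definition)."""

    crs: str
    shape: Tuple[int, ...]
    transform: Tuple[float, float, float, float, float, float]


class InMemoryRaster:
    """Toy reader: it satisfies the protocol without inheriting from it."""

    def __init__(self, shape, transform, crs):
        self.shape = shape
        self.transform = transform
        self.crs = crs


def describe(data: GeoDataLike) -> str:
    """Works with any object that provides crs, shape and transform."""
    a, _, _, _, e, _ = data.transform
    height, width = data.shape[-2:]
    return f"{width}x{height} px at {abs(a)}x{abs(e)} units/px in {data.crs}"


raster = InMemoryRaster(
    shape=(3, 504, 1040),
    transform=(10.0, 0.0, 539910.0, 0.0, -10.0, 7842990.0),
    crs="EPSG:32738",
)
print(isinstance(raster, GeoDataLike))  # True
print(describe(raster))  # 1040x504 px at 10.0x10.0 units/px in EPSG:32738
```

Structural typing of this kind means a new reader only needs to expose the expected attributes and methods for functions written against the protocol to accept it.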

Documentation

📚 Full documentation: spaceml-org.github.io/georeader

georeader makes it easy to read specific areas of an image, to reproject images from different satellites to a common grid (georeader.read), to convert between vector and raster formats (georeader.vectorize and georeader.rasterize), and to convert radiance to reflectance (georeader.reflectance).

Tutorials

Sentinel-2

Read rasters from different satellites

Used in other projects

Citation

If you find this code useful please cite:

@article{portales-julia_global_2023,
	title = {Global flood extent segmentation in optical satellite images},
	volume = {13},
	issn = {2045-2322},
	doi = {10.1038/s41598-023-47595-7},
	number = {1},
	urldate = {2023-11-30},
	journal = {Scientific Reports},
	author = {Portalés-Julià, Enrique and Mateo-García, Gonzalo and Purcell, Cormac and Gómez-Chova, Luis},
	month = nov,
	year = {2023},
	pages = {20316},
}
@article{ruzicka_starcop_2023,
	title = {Semantic segmentation of methane plumes with hyperspectral machine learning models},
	volume = {13},
	issn = {2045-2322},
	url = {https://www.nature.com/articles/s41598-023-44918-6},
	doi = {10.1038/s41598-023-44918-6},
	number = {1},
	journal = {Scientific Reports},
	author = {Růžička, Vít and Mateo-Garcia, Gonzalo and Gómez-Chova, Luis and Vaughan, Anna and Guanter, Luis and Markham, Andrew},
	month = nov,
	year = {2023},
	pages = {19999},
}

Acknowledgments

This research has been supported by the DEEPCLOUD project (PID2019-109026RB-I00, University of Valencia), funded by the Spanish Ministry of Science and Innovation (MCIN/AEI/10.13039/501100011033) and the European Union (NextGenerationEU).
