Deep Learning for Earth Observation - Data Preparation Pipeline

Project description

dl4eo

dl4eo is a Python package that streamlines the end-to-end pipeline for generating multi-source remote sensing datasets for deep learning applications. The pipeline primarily combines Sentinel-1, Sentinel-2, and Copernicus DEM data. It automates downloading, preprocessing, DEM/SAR integration, masking, and normalization, producing stackable chips for training robust Earth observation models. The final dataset contains 11 bands: seven bands from Sentinel-2, elevation and slope from the Copernicus DEM, and RTC (radiometrically terrain-corrected) VV and VH layers from Sentinel-1.
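To make the 11-band layout concrete, here is a minimal sketch of how the channels could be indexed. The group names and their order are assumptions for illustration; the description lists the band groups but does not state their order or which seven Sentinel-2 bands are used.

```python
# Hypothetical channel layout: the description names the band groups,
# but not their order or which seven Sentinel-2 bands are included.
BAND_GROUPS = [
    ("sentinel2", 7),   # seven Sentinel-2 bands
    ("elevation", 1),   # Copernicus DEM elevation
    ("slope", 1),       # slope derived from the DEM
    ("s1_vv", 1),       # Sentinel-1 RTC VV
    ("s1_vh", 1),       # Sentinel-1 RTC VH
]


def band_slices(groups=BAND_GROUPS):
    """Map each group name to the slice of channels it occupies."""
    slices, start = {}, 0
    for name, count in groups:
        slices[name] = slice(start, start + count)
        start += count
    return slices


def total_bands(groups=BAND_GROUPS):
    """Total number of channels across all groups."""
    return sum(count for _, count in groups)
```

With a chip loaded as a (bands, height, width) array, `chip[band_slices()["sentinel2"]]` would then select the optical channels under this assumed ordering.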


📦 Installation

Install directly from PyPI:

pip install dl4eo

Or install locally in development mode:

git clone https://github.com/your-username/dl4eo.git
cd dl4eo
pip install -e .

🚀 Quick Start

import dl4eo

dl4eo.generate_dataset(
    base_dir="/your/output_directory",
    aoi_shapefile_dir="/path/to/aoi/folder",  # folder containing one or more AOI shapefiles
    feature_shapefile="/path/to/lakes.shp",   # shapefile used for AOI box creation & visualization
    date_range="2020-08-01/2020-08-30",
    box_size_m=1280,                          # image chip extent (default = 2560)
    cloud_cover=15                            # optional: set max cloud cover % (default = 20)
)
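As a rough guide to what box_size_m means in pixels, the sketch below converts a chip extent in metres to an edge length in pixels. It assumes that box_size_m is the edge length of a square chip and that chips are on the 10 m grid used for the resampled bands; chip_size_px is a hypothetical helper, not part of dl4eo.

```python
def chip_size_px(box_size_m: float, pixel_size_m: float = 10.0) -> int:
    """Edge length of a square chip in pixels (assumes a square metric grid)."""
    if box_size_m <= 0 or pixel_size_m <= 0:
        raise ValueError("box_size_m and pixel_size_m must be positive")
    return round(box_size_m / pixel_size_m)
```

Under these assumptions, box_size_m=1280 gives 128 x 128 pixel chips and the default of 2560 gives 256 x 256.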

🧠 What It Does

The pipeline consists of the following automated stages:

  1. Download Sentinel-2 imagery via STAC API (cloud-cover filtered)
  2. Preprocess S2: resampling and band stacking
  3. AOI Box Generation: intersects lakes/AOI to create image chips
  4. DEM Integration: clips and resamples elevation data (e.g., Copernicus DEM, SRTM, or TanDEM-X)
  5. Download Sentinel-1 (SAR) from ASF, then clip and stack VV/VH
  6. Mask Generation using the provided lake shapefile
  7. Data Normalization across the full stack
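Stages 4 and 7 can be illustrated with a short NumPy sketch. This is not dl4eo's implementation: the package does not document how slope is derived or which normalization it applies, so the finite-difference slope and per-band min-max scaling below are assumptions, and both function names are hypothetical.

```python
import numpy as np


def slope_degrees(dem, pixel_size_m=10.0):
    """Slope in degrees from a 2-D elevation array on a square metric grid."""
    # np.gradient returns one array per axis: rows (y) first, then columns (x).
    dz_dy, dz_dx = np.gradient(dem.astype(np.float64), pixel_size_m)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))


def minmax_normalize(stack, eps=1e-12):
    """Rescale each band of a (bands, height, width) stack to [0, 1]."""
    stack = stack.astype(np.float64)
    lo = np.nanmin(stack, axis=(1, 2), keepdims=True)
    hi = np.nanmax(stack, axis=(1, 2), keepdims=True)
    # eps keeps constant bands from dividing by zero; they map to 0.
    return (stack - lo) / np.maximum(hi - lo, eps)
```

Normalizing per band matters here because the stacked layers have very different value ranges (reflectance, metres, degrees, backscatter).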

📂 Input Requirements

  • aoi_shapefile_dir: Folder containing one or more AOI .shp files
  • feature_shapefile: A shapefile representing features (e.g., lakes) to extract training samples from
  • Valid date range in the format: "YYYY-MM-DD/YYYY-MM-DD"
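Because a malformed date range would only surface once the pipeline starts downloading, it can help to check the string first. The helper below is a hypothetical pre-check using only the standard library; dl4eo may validate its inputs differently.

```python
from datetime import date


def parse_date_range(date_range: str):
    """Split "YYYY-MM-DD/YYYY-MM-DD" into (start, end) date objects."""
    try:
        start_text, end_text = date_range.split("/")
        start = date.fromisoformat(start_text)
        end = date.fromisoformat(end_text)
    except ValueError as exc:
        raise ValueError(
            f"expected 'YYYY-MM-DD/YYYY-MM-DD', got {date_range!r}"
        ) from exc
    if start > end:
        raise ValueError(f"start date {start} is after end date {end}")
    return start, end
```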

🧰 Dependencies

Installed automatically:

  • rasterio, geopandas, shapely, matplotlib
  • pystac-client, fiona, requests, numpy, joblib
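Since pystac-client is among the dependencies, a cloud-cover-filtered Sentinel-2 search like the one in the first pipeline stage might look roughly like the sketch below. The STAC endpoint (Earth Search), the collection name, and both function names are assumptions; the package does not state which catalog it queries.

```python
def build_search_params(bbox, date_range, cloud_cover=20,
                        collection="sentinel-2-l2a"):
    """Build keyword arguments for a cloud-cover-filtered STAC search."""
    return {
        "collections": [collection],
        "bbox": list(bbox),                # [min_lon, min_lat, max_lon, max_lat]
        "datetime": date_range,            # "YYYY-MM-DD/YYYY-MM-DD"
        "query": {"eo:cloud_cover": {"lt": cloud_cover}},
    }


def search_sentinel2(bbox, date_range, cloud_cover=20,
                     stac_url="https://earth-search.aws.element84.com/v1"):
    """Return matching STAC items (needs network access and pystac-client)."""
    # Imported lazily so build_search_params works without pystac-client.
    from pystac_client import Client

    client = Client.open(stac_url)
    search = client.search(**build_search_params(bbox, date_range, cloud_cover))
    return list(search.items())
```

Keeping the parameter construction separate from the network call makes the filter easy to inspect or log before any download starts.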

🗃 Output Structure

output_dir/
├── images/              # Raw Sentinel-2
├── Resampled/           # 10m resampled bands
├── stack/               # Stacked Sentinel-2 bands
├── DEM/                 # Elevation data
├── GRD/                 # Raw Sentinel-1 GRD
├── GRD_Extracted/       # Extracted by bounding box
├── Clipped_SAR/         # Matched SAR chips
├── stacked_with_sar/    # Combined S2 + SAR + DEM
├── mask/                # Binary masks (from lakes)
├── normalize/           # Final normalized image chips
├── AOI_boxes/           # AOI boxes (GeoJSONs)
└── shapefile/each/      # Individual AOI shapefiles
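After a run, a quick way to see which stages produced output is to compare the base directory against the layout above. The helper below is a hypothetical check that uses only the folder names listed in the tree; it does not assume when or how dl4eo creates them.

```python
from pathlib import Path

# Folder names copied from the output tree above.
OUTPUT_SUBDIRS = [
    "images", "Resampled", "stack", "DEM", "GRD", "GRD_Extracted",
    "Clipped_SAR", "stacked_with_sar", "mask", "normalize",
    "AOI_boxes", "shapefile/each",
]


def missing_outputs(base_dir):
    """Return the expected subfolders that do not exist under base_dir."""
    base = Path(base_dir)
    return [name for name in OUTPUT_SUBDIRS if not (base / name).is_dir()]
```

An empty list means every expected folder is present; it says nothing about whether the files inside are complete.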

🧪 Example Use Cases

  • Glacial lake mapping and segmentation
  • Flood extent extraction
  • Multimodal image fusion (S2+S1+DEM)
  • Chip-based data generation for training transformers and GANs

🧑‍💻 Author

Developed by Saurabh Kaushik,
Postdoctoral Researcher @ University of Arizona
Earth Observation, Deep Learning, Geo-Foundational Models, Cryosphere


📜 License

MIT License


📖 Citation

If you use dl4eo in your research or publications, please cite it as: Kaushik, S. (2025). dl4eo: A Python package for multi-source remote sensing data preparation for deep learning. Python Package Index. https://pypi.org/project/dl4eo/

BibTeX:

@misc{kaushik2025dl4eo,
  author       = {Saurabh Kaushik},
  title        = {{dl4eo: A Python package for multi-source remote sensing data preparation for deep learning}},
  year         = {2025},
  howpublished = {\url{https://pypi.org/project/dl4eo/}},
}

Download files

Source Distribution

dl4eo-0.2.3.tar.gz (19.0 kB)

Uploaded Source

Built Distribution

dl4eo-0.2.3-py3-none-any.whl (21.7 kB)

Uploaded Python 3

File details

Details for the file dl4eo-0.2.3.tar.gz.

File metadata

  • Download URL: dl4eo-0.2.3.tar.gz
  • Upload date:
  • Size: 19.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.7

File hashes

Hashes for dl4eo-0.2.3.tar.gz
Algorithm Hash digest
SHA256 ca7cf23641690a9623b99a8bbb19bf4f7d35675d33f907f03498cab16ad5c652
MD5 5ade82b3cde37e44cb5a3197ced3e7db
BLAKE2b-256 fa0617fa95fefcc166d6f78ce970bd4bb0c90d9eb98f01824828aeda6145c3f0

File details

Details for the file dl4eo-0.2.3-py3-none-any.whl.

File metadata

  • Download URL: dl4eo-0.2.3-py3-none-any.whl
  • Upload date:
  • Size: 21.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.7

File hashes

Hashes for dl4eo-0.2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 4ec8134752fa79f4a020faaa6f307fa1c9b4ca28c0bb79c8a9e286a4676a1601
MD5 455112e7c5ace2a7bf49093498ac1efa
BLAKE2b-256 dabe130249209248dc4a6433033e03a9a793edabbc86b7d1b9e8b3194e90d0b8
