Deep Learning for Earth Observation - Data Preparation Pipeline

Project description

dl4eo

dl4eo is a Python package that streamlines the end-to-end pipeline for generating multi-source remote sensing datasets for deep learning applications. The pipeline combines Sentinel-1, Sentinel-2, and Copernicus DEM data. It automates downloading, preprocessing, DEM/SAR integration, masking, and normalization, producing stackable chips for training robust Earth observation models. The final dataset contains 11 bands: seven bands from Sentinel-2, elevation and slope from the Copernicus DEM, and radiometrically terrain-corrected (RTC) VV and VH layers from Sentinel-1.
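The slope band mentioned above is a quantity derived from elevation. The following is a minimal sketch of one common way to compute it with NumPy finite differences. It is not taken from the dl4eo source: the function name `slope_degrees` and the `pixel_size_m` parameter are hypothetical, and the grid is assumed to be in a projected coordinate system with square pixels measured in metres.

```python
import numpy as np


def slope_degrees(elevation: np.ndarray, pixel_size_m: float) -> np.ndarray:
    """Return terrain slope in degrees for a 2-D elevation grid (metres).

    Hypothetical helper for illustration; dl4eo may compute slope differently.
    """
    # For a 2-D array np.gradient returns the derivative along rows (y)
    # first, then along columns (x). The scalar spacing applies to both axes.
    dz_dy, dz_dx = np.gradient(elevation.astype("float64"), pixel_size_m)
    # Slope is the angle whose tangent is the magnitude of the gradient.
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))


# A plane that rises 10 m per 10 m pixel along x has a 45 degree slope.
ramp = np.tile(np.arange(5) * 10.0, (5, 1))
slope = slope_degrees(ramp, pixel_size_m=10.0)
```

If the DEM is stored in geographic coordinates (degrees), it would need to be reprojected to a metric grid before a calculation like this is meaningful.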


📦 Installation

Install directly from PyPI:

pip install dl4eo

Or install locally in development mode:

git clone https://github.com/your-username/dl4eo.git
cd dl4eo
pip install -e .

🚀 Quick Start

import dl4eo

dl4eo.generate_dataset(
    base_dir="/your/output_directory",
    aoi_shapefile_dir="/path/to/aoi/folder",  # folder containing one or more AOI shapefiles
    feature_shapefile="/path/to/lakes.shp",   # shapefile used for AOI box creation & visualization
    date_range="2020-08-01/2020-08-30",
    box_size_m=1280,                          # image chip extent (default = 2560)
    cloud_cover=15                            # optional: set max cloud cover % (default = 20)
)
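To get a feel for `box_size_m`, it helps to convert it to pixels. The sketch below assumes that `box_size_m` is the side length of a square chip and that chips are sampled at 10 m, the resolution listed for the resampled bands in the output structure. Both points are assumptions about dl4eo's behavior, and `chip_size_pixels` is a hypothetical helper, not part of the package.

```python
def chip_size_pixels(box_size_m: float, resolution_m: float = 10.0) -> int:
    """Number of pixels along one side of a square chip.

    Hypothetical helper; assumes box_size_m is the chip side length in metres.
    """
    if box_size_m <= 0 or resolution_m <= 0:
        raise ValueError("box_size_m and resolution_m must be positive")
    return round(box_size_m / resolution_m)


side = chip_size_pixels(1280)          # value used in the Quick Start call
default_side = chip_size_pixels(2560)  # documented default
print(side, default_side)              # prints "128 256"
```

Under these assumptions, the Quick Start call would yield chips of 128 x 128 pixels, and an 11-band stack would have shape (11, 128, 128).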

🧠 What It Does

The pipeline consists of the following automated stages:

  1. Download Sentinel-2 imagery via STAC API (cloud-cover filtered)
  2. Preprocess S2: resampling and band stacking
  3. AOI Box Generation: intersects lakes/AOI to create image chips
  4. DEM Integration: clips and resamples elevation data (e.g., SRTM or TanDEM-X)
  5. Sentinel-1 (SAR): downloads scenes from ASF (Alaska Satellite Facility), then clips and stacks VV/VH
  6. Mask Generation using the provided lake shapefile
  7. Data Normalization across the full stack
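Stage 1 above can be pictured as a STAC search with a cloud-cover filter. The sketch below uses `pystac-client`, which is among the listed dependencies, but it is an illustration rather than dl4eo's actual implementation: the Earth Search endpoint, the `sentinel-2-l2a` collection name, and both function names are assumptions, since the source does not say which catalog the package queries.

```python
from typing import Any

# Assumed public endpoint and collection; dl4eo may use different ones.
STAC_URL = "https://earth-search.aws.element84.com/v1"
COLLECTION = "sentinel-2-l2a"


def build_search_params(bbox, date_range: str, cloud_cover: float = 20) -> dict[str, Any]:
    """Build keyword arguments for a cloud-cover filtered STAC search.

    bbox is (min_lon, min_lat, max_lon, max_lat) in WGS84 degrees.
    date_range uses the same "YYYY-MM-DD/YYYY-MM-DD" form as dl4eo,
    which is also a valid STAC datetime interval.
    """
    return {
        "collections": [COLLECTION],
        "bbox": list(bbox),
        "datetime": date_range,
        # Keep only scenes whose reported cloud cover is below the threshold.
        "query": {"eo:cloud_cover": {"lt": cloud_cover}},
    }


def search_scenes(bbox, date_range: str, cloud_cover: float = 20):
    """Run the search and return matching items (requires network access)."""
    from pystac_client import Client  # imported lazily so the builder works offline

    client = Client.open(STAC_URL)
    search = client.search(**build_search_params(bbox, date_range, cloud_cover))
    return list(search.items())


# Hypothetical bounding box, used only to show the shape of the request.
params = build_search_params((86.0, 27.5, 87.0, 28.5), "2020-08-01/2020-08-30", cloud_cover=15)
```

Separating the parameter builder from the network call keeps the filter logic easy to inspect and test without contacting the catalog.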

📂 Input Requirements

  • aoi_shapefile_dir: Folder containing one or more AOI .shp files
  • feature_shapefile: A shapefile representing features (e.g., lakes) to extract training samples from
  • date_range: A valid date range in the format "YYYY-MM-DD/YYYY-MM-DD"
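Checking the inputs listed above before starting a long download can save time. The following is a small sketch using only the standard library. `parse_date_range` and `find_shapefiles` are hypothetical helpers and not part of dl4eo, and it is not known whether the package performs similar checks internally.

```python
from datetime import date, datetime
from pathlib import Path


def parse_date_range(date_range: str) -> tuple[date, date]:
    """Split "YYYY-MM-DD/YYYY-MM-DD" into (start, end) dates."""
    start_text, sep, end_text = date_range.partition("/")
    if not sep:
        raise ValueError("date_range must look like 'YYYY-MM-DD/YYYY-MM-DD'")
    # strptime raises ValueError on malformed or impossible dates.
    start = datetime.strptime(start_text, "%Y-%m-%d").date()
    end = datetime.strptime(end_text, "%Y-%m-%d").date()
    if start > end:
        raise ValueError("start date must not be after end date")
    return start, end


def find_shapefiles(folder) -> list[Path]:
    """Return .shp files in folder that also have .shx and .dbf sidecars."""
    complete = []
    for shp in sorted(Path(folder).glob("*.shp")):
        # A shapefile is only readable when its index and attribute files exist.
        if all(shp.with_suffix(ext).exists() for ext in (".shx", ".dbf")):
            complete.append(shp)
    return complete


start, end = parse_date_range("2020-08-01/2020-08-30")
```

An empty list from `find_shapefiles` would indicate that the AOI folder has no complete shapefiles to process.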

🧰 Dependencies

Installed automatically:

  • rasterio, geopandas, shapely, matplotlib
  • pystac-client, fiona, requests, numpy, joblib

🗃 Output Structure

output_dir/
├── images/              # Raw Sentinel-2
├── Resampled/           # 10m resampled bands
├── stack/               # Stacked Sentinel-2 bands
├── DEM/                 # Elevation data
├── GRD/                 # Raw Sentinel-1 GRD
├── GRD_Extracted/       # Extracted by bounding box
├── Clipped_SAR/         # Matched SAR chips
├── stacked_with_sar/    # Combined S2 + SAR + DEM
├── mask/                # Binary masks (from lakes)
├── normalize/           # Final normalized image chips
├── AOI_boxes/           # AOI boxes (GeoJSONs)
└── shapefile/each/      # Individual AOI shapefiles
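After a run, the folder layout above can be inspected programmatically. The sketch below counts the files in each expected subfolder of the output directory. The folder names come from the tree above, while `summarize_outputs` is a hypothetical helper and not part of dl4eo.

```python
from pathlib import Path

# Subfolders listed in the output structure above.
EXPECTED_DIRS = [
    "images", "Resampled", "stack", "DEM", "GRD", "GRD_Extracted",
    "Clipped_SAR", "stacked_with_sar", "mask", "normalize",
    "AOI_boxes", "shapefile/each",
]


def summarize_outputs(base_dir) -> dict:
    """Map each expected subfolder to its file count, or None if missing."""
    base = Path(base_dir)
    summary = {}
    for name in EXPECTED_DIRS:
        folder = base / name
        if folder.is_dir():
            # Count regular files only; nested folders are ignored.
            summary[name] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            summary[name] = None
    return summary
```

Comparing the counts for `normalize` and `mask` is a quick way to spot chips that are missing a matching mask, assuming the pipeline writes one mask per normalized chip.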

🧪 Example Use Cases

  • Glacial lake mapping and segmentation
  • Flood extent extraction
  • Multimodal image fusion (S2+S1+DEM)
  • Chip-based data generation for training transformers and GANs

🧑‍💻 Author

Developed by Saurabh Kaushik,
Postdoctoral Researcher @ University of Arizona
Earth Observation, Deep Learning, Geo-Foundational Models, Cryosphere


📜 License

MIT License


📖 Citation

If you use dl4eo in your research or publications, please cite it as: Kaushik, S. (2025). dl4eo: A Python package for multi-source remote sensing data preparation for deep learning. Python Package Index. https://pypi.org/project/dl4eo/

BibTeX:

@misc{kaushik2025dl4eo,
  author       = {Saurabh Kaushik},
  title        = {{dl4eo: A Python package for multi-source remote sensing data preparation for deep learning}},
  year         = {2025},
  howpublished = {\url{https://pypi.org/project/dl4eo/}},
}
