Deep Learning for Earth Observation - Data Preparation Pipeline

dl4eo

dl4eo is a Python package that streamlines the end-to-end pipeline for generating multi-source remote sensing datasets for deep learning applications. The pipeline combines data from Sentinel-1, Sentinel-2, and the Copernicus DEM. It automates downloading, preprocessing, DEM/SAR integration, masking, and normalization, producing stackable chips for training robust Earth observation models. The final dataset contains 11 bands: seven bands from Sentinel-2, elevation and slope from the Copernicus DEM, and radiometrically terrain corrected (RTC) VV and VH layers from Sentinel-1.


📦 Installation

Install directly from PyPI:

pip install dl4eo

Or install locally in development mode:

git clone https://github.com/your-username/dl4eo.git
cd dl4eo
pip install -e .

🚀 Quick Start

import dl4eo

dl4eo.generate_dataset(
    base_dir="/your/output_directory",
    aoi_shapefile_dir="/path/to/aoi/folder",  # folder containing one or more AOI shapefiles
    feature_shapefile="/path/to/lakes.shp",   # shapefile used for AOI box creation & visualization
    date_range="2020-08-01/2020-08-30",
    box_size_m=1280,                          # image chip extent (default = 2560)
    cloud_cover=15                            # optional: set max cloud cover % (default = 20)
)
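
As a rough guide to what box_size_m means in pixels, the sketch below converts a chip extent in metres to an edge length in pixels. It assumes the 10 m grid suggested by the Resampled/ folder in the output structure; the helper name chip_size_px is hypothetical and is not part of the dl4eo API.

```python
def chip_size_px(box_size_m: float, resolution_m: float = 10.0) -> int:
    """Return the edge length, in pixels, of a square chip.

    Hypothetical helper (not part of dl4eo). Assumes square pixels on a
    10 m grid unless another resolution is passed in.
    """
    if box_size_m <= 0 or resolution_m <= 0:
        raise ValueError("box_size_m and resolution_m must be positive")
    return round(box_size_m / resolution_m)


print(chip_size_px(1280))  # extent used in the Quick Start call
print(chip_size_px(2560))  # documented default extent
```

Under that assumption, box_size_m=1280 corresponds to 128 x 128 pixel chips and the default of 2560 corresponds to 256 x 256 pixel chips.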

🧠 What It Does

The pipeline consists of the following automated stages:

  1. Download Sentinel-2 imagery via STAC API (cloud-cover filtered)
  2. Preprocess S2: resampling and band stacking
  3. AOI Box Generation: intersects lakes/AOI to create image chips
  4. DEM Integration: clips and resamples elevation data (e.g., SRTM or TanDEM-X)
  5. Sentinel-1 (SAR): downloads scenes from ASF, then clips and stacks VV/VH
  6. Mask Generation using the provided lake shapefile
  7. Data Normalization across the full stack
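
To make the AOI box stage more concrete, here is a minimal sketch of one way a square chip footprint could be derived from a feature location. It assumes the box is centred on a point (for example a lake centroid) and that coordinates are in a projected CRS with metre units such as UTM. The function square_box is hypothetical; the package may build its boxes differently.

```python
def square_box(center_x: float, center_y: float, box_size_m: float):
    """Return (minx, miny, maxx, maxy) for a square centred on a point.

    Hypothetical illustration, not the dl4eo implementation. Coordinates
    are assumed to be in a projected CRS whose units are metres.
    """
    if box_size_m <= 0:
        raise ValueError("box_size_m must be positive")
    half = box_size_m / 2.0
    return (
        center_x - half,
        center_y - half,
        center_x + half,
        center_y + half,
    )


# Example: a 1280 m box around an arbitrary UTM coordinate
bounds = square_box(500000.0, 4000000.0, 1280)
print(bounds)
```

The resulting bounds tuple follows the (minx, miny, maxx, maxy) order used by shapely and rasterio, so it could be passed on to clipping code that expects that layout.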

📂 Input Requirements

  • aoi_shapefile_dir: Folder containing one or more AOI .shp files
  • feature_shapefile: A shapefile representing features (e.g., lakes) to extract training samples from
  • Valid date range in the format: "YYYY-MM-DD/YYYY-MM-DD"
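
Because a malformed date range only surfaces once a download is attempted, it can help to check the string up front. The sketch below validates the "YYYY-MM-DD/YYYY-MM-DD" format using only the standard library; parse_date_range is a hypothetical helper and is not part of dl4eo.

```python
from datetime import date


def parse_date_range(date_range: str) -> tuple[date, date]:
    """Split 'YYYY-MM-DD/YYYY-MM-DD' into (start, end) dates.

    Hypothetical helper, not part of dl4eo. Raises ValueError if the
    separator is missing, a date is invalid, or the end precedes the start.
    """
    start_text, sep, end_text = date_range.partition("/")
    if not sep:
        raise ValueError("expected 'YYYY-MM-DD/YYYY-MM-DD'")
    start = date.fromisoformat(start_text.strip())
    end = date.fromisoformat(end_text.strip())
    if end < start:
        raise ValueError("end date is earlier than start date")
    return start, end


start, end = parse_date_range("2020-08-01/2020-08-30")
print(start, end)
```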

🧰 Dependencies

Installed automatically:

  • rasterio, geopandas, shapely, matplotlib
  • pystac-client, fiona, requests, numpy, joblib

🗃 Output Structure

output_dir/
├── images/              # Raw Sentinel-2
├── Resampled/           # 10m resampled bands
├── stack/               # Stacked Sentinel-2 bands
├── DEM/                 # Elevation data
├── GRD/                 # Raw Sentinel-1 GRD
├── GRD_Extracted/       # Extracted by bounding box
├── Clipped_SAR/         # Matched SAR chips
├── stacked_with_sar/    # Combined S2 + SAR + DEM
├── mask/                # Binary masks (from lakes)
├── normalize/           # Final normalized image chips
├── AOI_boxes/           # AOI boxes (GeoJSONs)
└── shapefile/each/      # Individual AOI shapefiles
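
After a run, a quick way to see which stages produced output is to check for the folders listed above. The following sketch does this with pathlib; the folder names are copied from the tree, while the function missing_output_dirs is a hypothetical helper and assumes every folder is created directly under base_dir.

```python
from pathlib import Path

# Folder names taken from the output tree in this README
EXPECTED_DIRS = [
    "images", "Resampled", "stack", "DEM", "GRD", "GRD_Extracted",
    "Clipped_SAR", "stacked_with_sar", "mask", "normalize",
    "AOI_boxes", "shapefile/each",
]


def missing_output_dirs(base_dir) -> list[str]:
    """Return the expected sub-folders that do not exist under base_dir.

    Hypothetical helper, not part of dl4eo.
    """
    base = Path(base_dir)
    return [name for name in EXPECTED_DIRS if not (base / name).is_dir()]


# Example: report what is missing in the Quick Start output directory
print(missing_output_dirs("/your/output_directory"))
```

An empty list would suggest that every stage wrote its folder; it does not confirm that the folders contain valid chips.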

🧪 Example Use Cases

  • Glacial lake mapping and segmentation
  • Flood extent extraction
  • Multimodal image fusion (S2+S1+DEM)
  • Chip-based data generation for training transformers and GANs

🧑‍💻 Author

Developed by Saurabh Kaushik,
Postdoctoral Researcher @ University of Arizona
Earth Observation, Deep Learning, Geo-Foundational Models, Cryosphere


📜 License

MIT License


📖 Citation

If you use dl4eo in your research or publications, please cite it as: Kaushik, S. (2025). dl4eo: A Python package for multi-source remote sensing data preparation for deep learning. Python Package Index. https://pypi.org/project/dl4eo/

BibTeX:

@misc{kaushik2025dl4eo,
  author       = {Saurabh Kaushik},
  title        = {{dl4eo: A Python package for multi-source remote sensing data preparation for deep learning}},
  year         = {2025},
  howpublished = {\url{https://pypi.org/project/dl4eo/}},
}
