
Remote Sensing Data-Fetcher and Data-Loader for Joint Classification of Hyperspectral and LiDAR Data


rs-fusion-datasets

rs-fusion-datasets is a remote sensing data-fetcher and data-loader for joint classification of hyperspectral, LiDAR and SAR data. It is a Python package that:

  1. Automatically downloads and loads many multimodal remote sensing datasets (houston, muufl, trento, berlin, augsburg, etc.)
  2. Provides ready-to-use torch dataloaders
  3. Provides utilities for visualization, dataset splitting, benchmarking, HSI-to-RGB conversion, etc.

[!IMPORTANT]

  1. Versions <= 0.18.3 have a serious bug in benchmarker.predicted_image(). It is fixed in later versions.

Datasets

| Dataset      | Fetcher Function      | Torch Dataset  | Bands     | Note      |
|--------------|-----------------------|----------------|-----------|-----------|
| Houston 2013 | fetch_houston2013     | Houston2013    | HSI,LiDAR |           |
| Trento       | fetch_trento          | Trento         | HSI,LiDAR |           |
| MUUFL        | fetch_muufl           | Muufl          | HSI,LiDAR |           |
| Houston 2018 | fetch_houston2018_ouc | Houston2018Ouc | HSI,LiDAR |           |
| Augsburg     | fetch_augsburg_ouc    | AugsburgOuc    | HSI,SAR   | aka. MDAS |
| Berlin       | fetch_berlin_ouc      | BerlinOuc      | HSI,SAR   |           |

Quick Start

Install

pip install rs-fusion-datasets

Use with torch

from rs_fusion_datasets import Houston2013, Trento, Muufl, Houston2018Ouc, BerlinOuc, AugsburgOuc
dataset = Muufl('train', patch_size=11)
x_h, x_l, y, extras = dataset[0]

Get the raw image and labels

from rs_fusion_datasets import fetch_houston2013, fetch_muufl, fetch_trento, split_spmatrix
# For Houston 2013
hsi, dsm, train_label, test_label, info = fetch_houston2013()
# For Muufl and Trento
casi, lidar, truth, info = fetch_muufl()
train_label, test_label = split_spmatrix(truth, 20)
# For fetch_houston2018_ouc, fetch_augsburg_ouc, fetch_berlin_ouc
hsi, dsm, train_label, test_label, all_label, info = fetch_houston2018_ouc()
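
To make the idea behind the `split_spmatrix(truth, 20)` call above more concrete, here is a minimal, self-contained sketch of a per-class train/test split over sparse (row, col, label) triples. It is not the library's implementation: the helper name `split_coo_labels`, the `seed` parameter, and the reading of the second argument as "number of training pixels per class" are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_coo_labels(rows, cols, labels, n_train_per_class, seed=0):
    """Split COO-style labelled pixels into train and test sets.

    Hypothetical helper (not part of rs-fusion-datasets): picks up to
    `n_train_per_class` pixels of each class for training and leaves
    the remaining labelled pixels for testing.
    """
    rng = random.Random(seed)

    # Group the positions of the labelled pixels by class.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        train_idx.extend(members[:n_train_per_class])
        test_idx.extend(members[n_train_per_class:])

    def take(indices):
        # Rebuild (rows, cols, labels) triples for the chosen pixels.
        indices = sorted(indices)
        return ([rows[i] for i in indices],
                [cols[i] for i in indices],
                [labels[i] for i in indices])

    return take(train_idx), take(test_idx)


# Toy ground truth: 6 labelled pixels, classes 1 and 2.
rows = [0, 0, 1, 2, 2, 3]
cols = [1, 4, 2, 0, 3, 3]
labels = [1, 1, 1, 2, 2, 2]
train, test = split_coo_labels(rows, cols, labels, n_train_per_class=2)
```

With this toy input, each class contributes two pixels to `train` and one to `test`, and no pixel appears in both sets.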

[!TIP] The returned labels are sparse matrices. You can either convert them to dense arrays:

train_label = train_label.todense()
test_label = test_label.todense()

Or use them directly to look up the labelled pixels, which is very fast:

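    # Inside a Dataset class where self.truth is a COO sparse label matrix
    def __getitem__(self, index):
        i = self.truth.row[index]              # row of the index-th labelled pixel
        j = self.truth.col[index]              # column of the index-th labelled pixel
        label = self.truth.data[index].item()  # class label as a Python scalar
        x_hsi = self.hsi[:, i, j]              # HSI spectrum at that pixel
        x_dsm = self.dsm[:, i, j]              # DSM channels at that pixel
        return x_hsi, x_dsm, label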
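
The `__getitem__` snippet above works because a COO sparse matrix stores only the labelled pixels as parallel row, column and value arrays, so the dataset never has to scan the full label map. The following runnable sketch mimics that layout with plain Python lists instead of NumPy and SciPy objects. The class name `PixelDataset` and the toy data are hypothetical and only illustrate the indexing pattern.

```python
class PixelDataset:
    """Toy per-pixel dataset indexed through COO-style label triples."""

    def __init__(self, hsi, dsm, rows, cols, labels):
        self.hsi = hsi        # nested lists laid out as hsi[band][row][col]
        self.dsm = dsm        # nested lists laid out as dsm[band][row][col]
        self.rows = rows      # row index of each labelled pixel
        self.cols = cols      # column index of each labelled pixel
        self.labels = labels  # class label of each labelled pixel

    def __len__(self):
        # Only labelled pixels count as samples.
        return len(self.labels)

    def __getitem__(self, index):
        i, j = self.rows[index], self.cols[index]
        x_hsi = [band[i][j] for band in self.hsi]  # all HSI bands at (i, j)
        x_dsm = [band[i][j] for band in self.dsm]  # all DSM bands at (i, j)
        return x_hsi, x_dsm, self.labels[index]


# A 2-band "HSI" and a 1-band "DSM" over a 2x3 scene with two labelled pixels.
hsi = [[[1, 2, 3], [4, 5, 6]],
       [[10, 20, 30], [40, 50, 60]]]
dsm = [[[0.5, 0.6, 0.7], [0.8, 0.9, 1.0]]]
dataset = PixelDataset(hsi, dsm, rows=[0, 1], cols=[2, 0], labels=[3, 1])
sample = dataset[0]
```

Here `len(dataset)` equals the number of labelled pixels, not the number of pixels in the scene, which is what makes iterating over a sparsely labelled image cheap.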

Utils

  1. <Dataset>.benchmarker: Draws the predicted label map and computes the confusion matrix, OA, AA, CA and Kappa. For usage, see demo_torch.py.
  2. <Dataset>.lbl2rgb: Converts the dataset's label map to an RGB image for visualization.
  3. <Dataset>.hsi2rgb: Converts HSI to a true-color RGB image.
  4. read_roi: Reads an ENVI ROI exported as a .txt file into a sparse matrix.
  5. split_spmatrix: Splits a sparse label matrix into a training set and a test set.
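
For readers unfamiliar with the metrics listed for the benchmarker, the sketch below shows how OA (overall accuracy), AA (average accuracy) and Cohen's Kappa are conventionally derived from a confusion matrix. It is not the benchmarker's code: the function name `classification_metrics` is hypothetical, and reading "CA" as per-class accuracy is an assumption.

```python
def classification_metrics(confusion):
    """Return (OA, AA, per-class accuracy, kappa) from a square confusion matrix.

    confusion[t][p] counts samples of true class t predicted as class p.
    Hypothetical helper for illustration; assumes at least one sample and
    a chance agreement below 1.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n_classes))

    # Overall accuracy: fraction of all samples classified correctly.
    oa = correct / total

    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[t][p] for t in range(n_classes))
                for p in range(n_classes)]

    # Per-class accuracy (assumed meaning of "CA") and its mean, AA.
    ca = [confusion[k][k] / row_sums[k] if row_sums[k] else 0.0
          for k in range(n_classes)]
    aa = sum(ca) / n_classes

    # Cohen's kappa: agreement corrected for chance agreement.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    kappa = (oa - expected) / (1 - expected)
    return oa, aa, ca, kappa


confusion = [[8, 2],
             [1, 9]]
oa, aa, ca, kappa = classification_metrics(confusion)
```

For this two-class example, 17 of 20 samples are correct, so OA and AA are both 0.85, and with a chance agreement of 0.5 the Kappa comes out at about 0.7.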

Contribution

We welcome all contributions, including issues, pull requests, feature requests and discussions.

License

Copyright 2023-2025 songyz2019

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Credits

Augsburg:
The data is publicly available at 10.14459/2022mp1657312. If you use this data set, please cite our paper.
@article{hu2022mdas,
  title={MDAS: A New Multimodal Benchmark Dataset for Remote Sensing},
  author={Hu, Jingliang and Liu, Rong and Hong, Danfeng and Camero, Andr{\'e}s and Yao, Jing and Schneider, Mathias and Kurz, Franz and Segl, Karl and Zhu, Xiao Xiang},
  journal={Earth System Science Data Discussions},
  pages={1--26},
  year={2022},
  publisher={Copernicus GmbH},
  doi={10.5194/essd-2022-155}
}

Berlin:
Okujeni, A.; Van Der Linden, S.; Hostert, P. Berlin-Urban-Gradient dataset 2009 - An EnMAP Preparatory Flight Campaign (Datasets); GFZ Data Services: Potsdam, Germany, 2016.

Houston2018: 
https://machinelearning.ee.uh.edu/2018-ieee-grss-data-fusion-challenge-fusion-of-multispectral-lidar-and-hyperspectral-data/
The dataset can be downloaded here subject to the terms and conditions listed below. If you wish to use the data, please be sure to email us and provide your Name, Contact information, affiliation (University, research lab etc.), and an acknowledgement that you will cite this dataset and its source appropriately, as well as provide an acknowledgement to the IEEE GRSS IADF and the Hyperspectral Image Analysis Lab at the University of Houston, in any manuscript(s) resulting from it.

Houston2013: 
https://machinelearning.ee.uh.edu/?page_id=459
The 2013_IEEE_GRSS_DF_Contest_Samples_VA.txt in this repo is exported from the original 2013_IEEE_GRSS_DF_Contest_Samples_VA.roi.
The dataset was collected by NCALM at the University of Houston (UH) in June 2012, covering the University of Houston campus. The data was prepared and pre-processed with the assistance of Xiong Zhou, Minshan Cui, Abhinav Singhania and Dr. Juan Carlos Fernández Díaz.
The Data Fusion Technical Committee would like to express its great appreciation to NCALM for providing the data, to UH students, staff and faculty for preparing the data, and to GRSS and DigitalGlobe Inc. for their continuous support in providing funding and resources for the Data Fusion Contest.

Muufl:
https://github.com/GatorSense/MUUFLGulfport
Note: If this data is used in any publication or presentation the following reference must be cited:
P. Gader, A. Zare, R. Close, J. Aitken, G. Tuell, “MUUFL Gulfport Hyperspectral and LiDAR Airborne Data Set,” University of Florida, Gainesville, FL, Tech. Rep. REP-2013-570, Oct. 2013.
If the scene labels are used in any publication or presentation, the following reference must be cited:
X. Du and A. Zare, “Technical Report: Scene Label Ground Truth Map for MUUFL Gulfport Data Set,” University of Florida, Gainesville, FL, Tech. Rep. 20170417, Apr. 2017. Available: http://ufdc.ufl.edu/IR00009711/00001.
If any of this scoring or detection code is used in any publication or presentation, the following reference must be cited:
T. Glenn, A. Zare, P. Gader, D. Dranishnikov. (2016). Bullwinkle: Scoring Code for Sub-pixel Targets (Version 1.0) [Software]. Available from https://github.com/GatorSense/MUUFLGulfport/.

Trento:
The default URL of the Trento dataset is https://github.com/tyust-dayu/Trento/tree/b4afc449ce5d6936ddc04fe267d86f9f35536afd

About GitHub hosted dataset in rs-fusion-datasets-dist:
All datasets are publicly available for download, but I could not find direct links suitable for automatic loading (for example, some authors upload the data via net disk apps).
The suffix of each dataset name is only a 3-character UID. I upload these datasets AS IS, without editing anything, so that each one is just a mirror.
`augsburg-ouc`: From https://github.com/oucailab/DCMNet/
`berlin-ouc`: From https://github.com/oucailab/DCMNet/
`houston2018-ouc`: From https://github.com/oucailab/DCMNet/
`houston2013-mmr`: From: https://github.com/likyoo/Multimodal-Remote-Sensing-Toolkit/

Inspiration
This project is inspired by torchgeo and torchrs.
