# API for accessing HSI datasets
## Install

```bash
pip install HSI-Dataset-API
```
## Links to the available HSI datasets
- Nextcloud: HSI Dataset v1.3.zip
- Google Drive: HSI Dataset v1.3.zip
## Dataset structure

The dataset should be stored in one of the following structures:
### Plain structure (#1)

```
{dataset_name}
├── hsi
│   ├── 1.npy
│   └── 1.yml
├── masks
│   └── 1.png
└── meta.yaml
```
Or in a structure like the following (such a structure is created when using data cropping):

### Cropped data structure (#2)
```
{dataset_name}
├── hsi
│   ├── specter_1
│   │   ├── 1.npy
│   │   ├── 1.yml
│   │   ├── 2.npy
│   │   └── 2.yml
│   └── specter_2
│       ├── 1.npy
│       └── 1.yml
├── masks
│   ├── specter_1
│   │   ├── 1.png
│   │   └── 2.png
│   └── specter_2
│       └── 1.png
└── meta.yaml
```
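For illustration only, here is a minimal sketch (not part of the API) that lays out an empty dataset in the plain structure (#1) using `numpy` and `PyYAML`; every file name, shape, and value below is a placeholder:

```python
import os
import numpy as np
import yaml

root = 'example_dataset'  # hypothetical {dataset_name}
os.makedirs(os.path.join(root, 'hsi'), exist_ok=True)
os.makedirs(os.path.join(root, 'masks'), exist_ok=True)

# One HSI cube and its side-car meta file (a bands x height x width layout is assumed).
np.save(os.path.join(root, 'hsi', '1.npy'), np.zeros((237, 512, 512), dtype=np.float32))
with open(os.path.join(root, 'hsi', '1.yml'), 'w') as f:
    yaml.safe_dump({'height': 512, 'width': 512, 'layersCount': 237}, f)

# The mask would normally be a real PNG; an empty placeholder is written here.
open(os.path.join(root, 'masks', '1.png'), 'wb').close()

# Dataset-level meta information.
with open(os.path.join(root, 'meta.yaml'), 'w') as f:
    yaml.safe_dump({'name': 'HSI Dataset example', 'classes': {'potato': 1}}, f)
```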
### meta.yaml

In this file you should provide a description of the classes (their names and labels). You can also store any other helpful information that describes the dataset.
For example:
```yaml
name: HSI Dataset example
description: Some additional info about dataset
classes:
  cat: 1
  dog: 2
  car: 3
wave_lengths:
  - 420.0
  - 640.0
  - 780.0
```
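As a quick illustration (plain PyYAML, no API involved; the path is hypothetical), the file can be parsed and the class mapping inverted to translate mask labels back into class names:

```python
import yaml

with open('example/dataset_example/meta.yaml') as f:
    meta = yaml.safe_load(f)

# Invert {name: label} into {label: name} to decode mask pixel values.
label_to_name = {label: name for name, label in meta['classes'].items()}
print(label_to_name)  # e.g. {1: 'cat', 2: 'dog', 3: 'car'}
```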
### {number}.yml

In this file you can store HSI-specific information, such as the capture date, original file name, or humidity.
For example:
```yaml
classes:
  - potato
height: 512
width: 512
layersCount: 237
original_filename: '210730_134940_'
top_left:
  - 0
  - 0
```
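For illustration, a single HSI cube and its side-car metadata can also be read without the API; a minimal sketch with `numpy` and `PyYAML` (the paths are hypothetical):

```python
import numpy as np
import yaml

hsi = np.load('example/dataset_example/hsi/1.npy')
with open('example/dataset_example/hsi/1.yml') as f:
    info = yaml.safe_load(f)

print(hsi.shape, info['layersCount'], info['classes'])
```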
## Python API

Via the API presented in this repo you can access the dataset.
### Importing

```python
from hsi_dataset_api import HsiDataset, HsiDataCropper
```
### Cropping the data

```python
base_path = '/mnt/data/corrected_hsi_data'
output_path = '/mnt/data/cropped_hsi_data'
classes = ['potato', 'tomato']
selected_folders = ['HSI_1', 'HSI_2']  # Completely optional

cropper = HsiDataCropper(side_size=512, step=8, objects_ratio=0.20, min_class_ratio=0.01)
cropper.crop(base_path, output_path, classes, selected_folders)
```
### Plot cropped data statistics

```python
cropper.draw_statistics()
```
### Using the data

#### Create Data Access Object

```python
dataset = HsiDataset('../example/dataset_example', cropped_dataset=False)
```
The parameter `cropped_dataset` controls the expected dataset structure. If the dataset is stored on disk in the second structure (#2), set this parameter to `True`.
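For example, a cropped dataset produced by the `HsiDataCropper` call above could be opened like this (reusing the hypothetical output path from the cropping example):

```python
# Structure #2 (cropped): note cropped_dataset=True.
dataset = HsiDataset('/mnt/data/cropped_hsi_data', cropped_dataset=True)
```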
#### Getting the dataset meta information

```python
dataset.get_dataset_description()
```
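Presumably this returns the parsed contents of `meta.yaml`; the following usage sketch assumes it is a dict with the keys shown in the example above:

```python
description = dataset.get_dataset_description()
# Assumed to mirror meta.yaml, e.g. containing a 'classes' mapping:
print(description['classes'])  # e.g. {'cat': 1, 'dog': 2, 'car': 3}
```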
#### Getting the shuffled train data using a Python generator

```python
for data_point in dataset.data_iterator(opened=True, shuffle=True):
    hyperspecter = data_point.hsi
    mask = data_point.mask
    meta = data_point.meta
```
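Building on the iterator above, here is a minimal sketch (assuming `data_point.hsi` and `data_point.mask` are NumPy arrays of identical shape across samples) that collects the generator output into training arrays:

```python
import numpy as np

hsis, masks = [], []
for data_point in dataset.data_iterator(opened=True, shuffle=True):
    hsis.append(data_point.hsi)
    masks.append(data_point.mask)

# Stack into single arrays; this assumes every sample has the same shape.
X = np.stack(hsis)
y = np.stack(masks)
print(X.shape, y.shape)
```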
## Examples

See the Jupyter notebook example at the following link:
https://nbviewer.org/github/Banayaki/hsi_dataset_api/blob/master/examples/ClassificationMLP.ipynb
## Source code

Source code is available on GitHub: https://github.com/Banayaki/hsi_dataset_api
## File details

Details for the file `HSI_Dataset_API-1.5.3.tar.gz`.

### File metadata
- Download URL: HSI_Dataset_API-1.5.3.tar.gz
- Upload date:
- Size: 7.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.6.0 importlib_metadata/4.8.2 pkginfo/1.7.1 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.62.2 CPython/3.9.5
### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 28d40b77905d0102097d1ca4042fb4c35545f884d26c9f9546cae937c8cd07a0 |
| MD5 | afe06698a577d42b1463177547b00c94 |
| BLAKE2b-256 | c855206dd5203a515a721185a4ad465416ed570fa81ec76fdb738e5683f5789c |
## File details

Details for the file `HSI_Dataset_API-1.5.3-py3-none-any.whl`.

### File metadata
- Download URL: HSI_Dataset_API-1.5.3-py3-none-any.whl
- Upload date:
- Size: 8.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.6.0 importlib_metadata/4.8.2 pkginfo/1.7.1 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.62.2 CPython/3.9.5
### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 5774fab65eeab5290ad4b609b9e9f570494618538db3bed392e977b691af649d |
| MD5 | f71f59be3925a42907975a0a51facb6c |
| BLAKE2b-256 | 430fa13e8984273d61af8c0fed07df079425e94ab8bcba7e8860e3529b99809b |
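For completeness, a minimal sketch (standard library only) that checks a downloaded file against its published SHA256 digest; it assumes the wheel above sits in the current directory:

```python
import hashlib

expected = '5774fab65eeab5290ad4b609b9e9f570494618538db3bed392e977b691af649d'

with open('HSI_Dataset_API-1.5.3-py3-none-any.whl', 'rb') as f:
    digest = hashlib.sha256(f.read()).hexdigest()

print('OK' if digest == expected else 'hash mismatch')
```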