Dealing with geospatial data: faster computation of NDVI indices using Dask
Project description
Geodata-preprocess-IIITB-SCL
This is the official documentation of geodata-preprocess-IIITB-SCL. The package has many useful functions for dealing with geospatial data, and a few functions, such as the computation of NDVI and MNDWI, are integrated with Dask to speed up their computation.
Installation
The package can be installed with pip, using the following command on both Windows and Linux:
$ pip install geodata-preprocess-IIITB-SCL
Usage
All Sorted File Names
Geospatial data file names are mostly represented by date and time. For a few specific tasks, such as time series forecasting, it may be necessary to get all files in sequential order. This function returns a list of all file names in sorted order.
Function
get_file_names(folder_path)
Parameters
- folder_path: Folder path where geospatial Images exist
Return Type
Ordered name list of Geospatial Images
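As a rough illustration of the idea (not the package's actual implementation), a sorted listing can be sketched with the standard library alone. The helper name `sorted_file_names` is hypothetical, and the sketch assumes that the file names sort chronologically as plain strings, which holds only for zero-padded, year-first dates.

```python
import os


def sorted_file_names(folder_path):
    """Return the names of the regular files in folder_path, sorted as strings."""
    names = [
        name
        for name in os.listdir(folder_path)
        if os.path.isfile(os.path.join(folder_path, name))
    ]
    # Lexicographic order matches chronological order only when the date is
    # zero-padded and written year-first, e.g. 20200105.tif before 20200210.tif.
    return sorted(names)
```

If the names use another date layout, a `key` function that parses the date would be needed instead of plain string sorting.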
Number of Bands
This function finds the number of bands in the image (red, green, blue, infrared, etc.).
Function
number_of_bands(filepath)
Parameters
- filepath: Path of the image.
Return Type
Integer value, number of bands.
Numpy Array of the Image
This function converts the file to NumPy array format with all its bands.
Function
numpy_image(filepath)
Parameters
- filepath: Path of the image.
Return Type
Numpy Array.
Dataframe of the Image
This function converts the geospatial file to a pandas dataframe.
Function
dataframe_image(filepath)
Parameters
- filepath: Path of the image.
Return Type
Pandas dataframe.
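The documentation does not describe the layout of the resulting dataframe. One plausible layout, shown here purely as an assumption, is one row per pixel and one column per band. The helper `bands_to_dataframe`, the `band_N` column names and the `(bands, rows, cols)` input shape are all hypothetical.

```python
import numpy as np
import pandas as pd


def bands_to_dataframe(image):
    """Flatten a (bands, rows, cols) array into one row per pixel."""
    bands = image.shape[0]  # the first axis is assumed to hold the bands
    # Each band becomes a column; pixels are listed in row-major order.
    flat = image.reshape(bands, -1).T
    columns = [f"band_{i + 1}" for i in range(bands)]
    return pd.DataFrame(flat, columns=columns)
```

With this layout, the number of bands is simply the number of columns.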
Min Max Scaling of Dataframe
This function performs min-max scaling of the dataframe.
Function
min_max_scaled(df_raw)
Parameters
- df_raw: Input pandas dataframe.
Return Type
Numpy array representing scaled values.
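Min-max scaling maps each value x to (x - min) / (max - min), so the result lies in [0, 1]. The sketch below shows that formula in NumPy. It assumes scaling is done per column, which the documentation does not state, and the name `min_max_scale` is hypothetical.

```python
import numpy as np


def min_max_scale(values):
    """Scale each column to [0, 1] using (x - min) / (max - min)."""
    values = np.asarray(values, dtype=float)
    col_min = values.min(axis=0)
    col_range = values.max(axis=0) - col_min
    # Constant columns have zero range; divide by 1 so they map to 0
    # instead of triggering a division by zero.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (values - col_min) / safe_range
```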
Convert the numpy to dask array
This function converts a NumPy array to a Dask array, split into chunks of the specified, uniform size.
Function
numpy_to_dask_array(df,chunk_len)
Parameters
- df: Input dataframe
- chunk_len: Specifies the chunk size.
Return Type
Dask array.
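For reference, Dask itself provides `dask.array.from_array` for this kind of conversion. The snippet below is a minimal sketch of that call, not necessarily what `numpy_to_dask_array` does internally; the array contents and the chunk size of 2 are made up for illustration.

```python
import numpy as np
import dask.array as da

arr = np.arange(16, dtype=float).reshape(4, 4)
chunk_len = 2  # hypothetical chunk size, applied along every axis

# An integer chunk size splits each axis into blocks of that length.
darr = da.from_array(arr, chunks=chunk_len)

# Dask arrays are lazy: nothing is computed until .compute() is called.
total = darr.sum().compute()
```

Smaller chunks give more parallelism but more scheduling overhead, so the best chunk size depends on the image size and the available memory.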
One hot to label
Some geospatial data is segmented, with each pixel classified to a label. Open-source labelled data is generally one-hot encoded. This function converts it to labelled form.
Function
one_hot_to_label(file_path)
Parameters
- file_path: Path of the image.
Return Type
Numpy array representing labelled data with only one band.
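The conversion itself can be sketched on an in-memory array with `numpy.argmax`. This is an illustration of the technique only: the helper `one_hot_to_labels` is hypothetical, and it assumes the one-hot classes are stored along the first (band) axis.

```python
import numpy as np


def one_hot_to_labels(one_hot):
    """Collapse a (classes, rows, cols) one-hot array into a (rows, cols) label map."""
    # argmax over the class axis gives, for each pixel, the index of the band set to 1.
    return np.argmax(one_hot, axis=0)
```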
Ordered labels
The labels of an image might not follow a sequential form. For example, a set of images may use only the pixel labels 2, 4 and 7. This function remaps such labels to a sequential form.
Function
get_ordered_labels(y)
Parameters
- y: Labelled NumPy array.
Return Type
Ordered numpy array.
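One common way to make labels sequential is `numpy.unique` with `return_inverse=True`. The sketch below uses that approach under the assumption that the new labels start at 0 and follow the sorted order of the original ones; whether `get_ordered_labels` starts at 0 or 1 is not stated here, and the name `relabel_sequential` is hypothetical.

```python
import numpy as np


def relabel_sequential(y):
    """Map arbitrary integer labels (e.g. 2, 4, 7) to 0, 1, 2, ... keeping the shape."""
    y = np.asarray(y)
    # np.unique returns the sorted distinct labels and, for every element,
    # the index of its label within that sorted list.
    _, inverse = np.unique(y.ravel(), return_inverse=True)
    return inverse.reshape(y.shape)
```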
Normalized difference
This is the key function behind the NDVI and MNDWI indices. Passing the red and near-infrared bands gives the NDVI index, and passing the short-wave infrared and green bands gives the MNDWI index for any geospatial image.
Function
normalized_difference(b1, b2)
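A normalized difference of two bands is (b1 - b2) / (b1 + b2). By the usual definitions, NDVI is (NIR - Red) / (NIR + Red) and MNDWI is (Green - SWIR) / (Green + SWIR). The sketch below implements the formula in NumPy; the name `normalized_difference_sketch`, the argument order and the choice to return 0 where both bands are 0 are assumptions, not the package's documented behaviour.

```python
import numpy as np


def normalized_difference_sketch(b1, b2):
    """Compute (b1 - b2) / (b1 + b2) element-wise."""
    b1 = np.asarray(b1, dtype=float)
    b2 = np.asarray(b2, dtype=float)
    total = b1 + b2
    out = np.zeros_like(total)
    # Divide only where the denominator is non-zero; other pixels stay at 0.
    np.divide(b1 - b2, total, out=out, where=total != 0)
    return out
```

Under these assumptions, NDVI would be `normalized_difference_sketch(nir, red)` and MNDWI would be `normalized_difference_sketch(green, swir)`.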
NDVI Computation (Returning list)
The functions here find the NDVI indices of a list of geospatial images.
Without Dask
Function
find_ndvi_list(file_path_list)
Parameters
- file_path_list: List of image paths.
Return Type
List of NDVI indices (NumPy arrays), in the same order as the input list.
With Dask
Function
find_ndvi_list_with_dask(worker_nodes,file_path_list)
Parameters
- file_path_list: List of image paths.
- worker_nodes: Number of dask worker nodes in a cluster
Return Type
List of NDVI indices (NumPy arrays), in the same order as the input list.
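The general pattern of computing one index per image in parallel while keeping the input order can be sketched with `dask.delayed`. This is not the package's implementation: it works on in-memory (NIR, red) band pairs rather than file paths, it uses Dask's local threaded scheduler instead of a cluster of worker nodes, and the names `ndvi` and `ndvi_list_parallel` are hypothetical.

```python
import numpy as np
import dask


def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), with 0 where both bands are 0."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    out = np.zeros_like(total)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


def ndvi_list_parallel(band_pairs, workers):
    """Compute NDVI for each (nir, red) pair in parallel, preserving input order."""
    # Each call is wrapped as a lazy task; nothing runs until dask.compute.
    tasks = [dask.delayed(ndvi)(nir, red) for nir, red in band_pairs]
    # dask.compute returns results in the same order as the tasks were passed.
    return list(dask.compute(*tasks, scheduler="threads", num_workers=workers))
```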
NDVI Computation (Saving the values in folder)
Without Dask
Function
find_and_write_ndvi_list(file_path_list,destination_folder)
Parameters
- file_path_list: List of image paths.
- destination_folder: path where indices will be saved.
Return Type
None
With Dask
Function
find_and_write_ndvi_list_with_dask(worker_nodes,file_path_list,destination_folder)
Parameters
- file_path_list: List of image paths.
- worker_nodes: Number of dask worker nodes in a cluster
- destination_folder: path where indices will be saved.
Return Type
None
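The output format written to the destination folder is not documented here. As a sketch of the general pattern only, the example below saves each array as a `.npy` file with `numpy.save`. The helper `write_arrays`, the `_ndvi.npy` suffix and the returned list of paths are assumptions made for illustration.

```python
import os
import numpy as np


def write_arrays(arrays, names, destination_folder):
    """Save each array as <name>_ndvi.npy in destination_folder and return the paths."""
    os.makedirs(destination_folder, exist_ok=True)
    paths = []
    for array, name in zip(arrays, names):
        path = os.path.join(destination_folder, f"{name}_ndvi.npy")
        np.save(path, array)
        paths.append(path)
    return paths
```

A file written this way can be read back with `numpy.load`.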
Contributing
The following are the core contributors:
- Pratyush Upadhyay
- Deeksha Agarwal
License
geodata-preprocess-IIITB-SCL was created by IITB-SCL. It is licensed under the terms of the MIT license.
Hashes for geodata-preprocess-IIITB-SCL-0.1.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 219c161553abc3a69e5d1184c0c2b453a5b27442d9e9918c594403d0175bce8a
MD5 | 2bc5e9433c8353dba35d8f6757a52741
BLAKE2b-256 | e3632fa5236f82e8419db8b6a02eca6050b126b5d90a9aa7c4f7cb7a82162e04
Hashes for geodata_preprocess_IIITB_SCL-0.1.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | a08a747898c4bb796540f61021f55cec5976db67eb108cfe2c6eac507403cb4d
MD5 | 2a204fdb598e0dd2e82ecf3ae9819d86
BLAKE2b-256 | eaa6a0e63c5caa5a780618a963c71cf1d5be9b9e3af6b77f7e0d9e664773fd41