A package to read, process and export GeoTIFFs.

Python-for-Remote-Sensing-and-GIS

pyrsgis enables the user to read, process and export GeoTIFFs. The module is built on the GDAL library, but is much more convenient when it comes to reading and exporting GeoTIFFs. Several other functions in this package ease raster pre-processing and are currently focused on machine learning applications.

pyrsgis also supports reading satellite data directly from .tar.gz files. However, reading from .tar.gz files is currently in its beta phase. Please do not use this package for commercial purposes without my explicit permission. Researchers and academicians are welcome to provide feedback and request technical support. Since this is an open-source voluntary project, collaborations are most welcome. Please write to me at pratkrt@gmail.com or connect with me on LinkedIn or Twitter.

To install using pip, see the PyPI page - link
To install using conda, see the Anaconda page - link

Recommended citation:
Tripathy, P. pyrsgis: A Python package for remote sensing and GIS. V0.3. Available at https://pypi.org/project/pyrsgis/.

Sample code

1. Reading .tif extension file

Import the module and define the input file path.

from pyrsgis import raster

file_path = r'D:/your_file_name.tif'
  • To read all the bands of a stacked satellite image:
ds, arr = raster.read(file_path, bands='all')

Here, ds is the data source object, similar to that of GDAL, and arr is the NumPy array that contains all the bands of the input raster. arr can be 2D or 3D depending on the input data. You can check the shape of the array using print(arr.shape). The bands argument of the raster.read function defaults to 'all'.

  • To read a list of bands of a stacked satellite image:
ds, arr = raster.read(file_path, bands=[2, 3, 4])

Passing the band numbers in a list returns bands 2, 3 and 4 as a three-dimensional NumPy array.

  • To read a specific band from stacked satellite image:
ds, arr = raster.read(file_path, bands=2)

Passing a single band number returns that particular band as a two-dimensional NumPy array.

  • To read a single band TIF file:
ds, arr = raster.read(file_path)

Since the bands argument defaults to 'all', this will read all the bands in the input file, here, one band only.
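
The band selection described above can be mimicked with plain NumPy indexing. This is a minimal sketch, assuming the array is ordered (bands, rows, columns) and that band numbers are 1-based as in GDAL. The synthetic stack array and the select_bands helper are hypothetical and are not part of pyrsgis.

```python
import numpy as np

# Synthetic 4-band raster with 3 rows and 5 columns (assumed band-first layout)
stack = np.arange(4 * 3 * 5).reshape(4, 3, 5)

def select_bands(arr, bands='all'):
    """Hypothetical helper mirroring the `bands` argument with 1-based band numbers."""
    if isinstance(bands, str) and bands == 'all':
        return arr                              # every band, unchanged
    if isinstance(bands, int):
        return arr[bands - 1]                   # single band -> 2D array
    return arr[[b - 1 for b in bands]]          # list of bands -> 3D array

all_bands = select_bands(stack)
three_bands = select_bands(stack, bands=[2, 3, 4])
one_band = select_bands(stack, bands=2)

print(all_bands.shape, three_bands.shape, one_band.shape)
```

A list of bands keeps the array three-dimensional, while a single integer drops the band axis, which matches the 3D and 2D results described above.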

2. Exporting .tif extension file

In all the examples below, it is assumed that the number of rows and columns and the cell size of the input and output rasters are the same. All of these are stored in the `ds` variable; please see details here: link.

  • To export all bands of a 3D array:
out_file_path = r'D:/sample_file_all_bands.tif'
raster.export(arr, ds, out_file_path, dtype='int', bands='all')

The dtype argument of the above function defaults to 'default', which corresponds to 'int' (16-bit integer). If dtype is left at its default and the data type of the provided ds is not int, the data type of ds is used. If dtype is specified explicitly and differs from the data type of ds, the explicit value overrides the data type of ds. Take care to change this when exporting arrays with large values. Similarly, to export a float type array (e.g. NDVI), use dtype='float'. Data types with a high pixel depth, e.g. Integer32, Integer64 or float, use more space on the hard drive, so the default has been set to integer. To export any float data type, the argument must be passed explicitly.
These are the options for the dtype argument: byte, cfloat32, cfloat64, cint16, cint32, float32, float64, int16, int32, uint8, uint16, uint32.
The NoData value can be set explicitly using the nodata parameter, which defaults to -9999.
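
To see why the choice of data type matters for large or fractional values, here is a small NumPy-only sketch. It does not call pyrsgis; the sample values are made up, and it only illustrates what casting an array to a 16-bit integer does.

```python
import numpy as np

reflectance = np.array([120, 40000, 70000])   # made-up values, two exceed the int16 range
ndvi = np.array([-0.4, 0.25, 0.75])           # made-up fractional index values

# Casting to 16-bit integer wraps values outside -32768..32767
as_int16 = reflectance.astype(np.int16)
# Casting floats to an integer type drops the fractional part
ndvi_as_int = ndvi.astype(np.int16)

# A wider integer type or a float type preserves the values
as_int32 = reflectance.astype(np.int32)
ndvi_as_float = ndvi.astype(np.float32)

print(as_int16, ndvi_as_int)
print(as_int32, ndvi_as_float)
```

The first print shows corrupted values, which is the situation the explicit dtype argument is meant to avoid.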

  • To export a list of bands of a 3D array:
out_file_path = r'D:/sample_file_bands_234.tif'
raster.export(arr, ds, out_file_path, bands=[2, 3, 4])
  • To export any one band of a 3D array:
out_file_path = r'D:/sample_file_band_3.tif'
raster.export(arr, ds, out_file_path, bands=3)
  • To export a single band array:
out_file_path = r'D:/sample_file.tif'
raster.export(arr, ds, out_file_path)

where, arr should be a 2D array.

  • Export the raster with compression:
    The compression type can also be defined while exporting the raster by using the compress parameter. LZW and DEFLATE are a couple of the options. Defaults to None.

  • Example with all default parameters:

out_file_path = r'D:/sample_file.tif'
raster.export(band, ds, filename='pyrsgis_outFile.tif', dtype='int', bands=1, nodata=-9999, compress=None)

3. Converting TIF to CSV

GeoTIFF files can be converted to CSV files using pyrsgis. Every band is flattened to a one-dimensional array and written to the CSV file. Such files are very useful for statistical analysis.
Import the function:

from pyrsgis.convert import rastertocsv

To convert all the bands present in a folder:

your_dir = r"D:/your_raster_directory"
out_file_path = r"D:/yourFilename.csv"

rastertocsv(your_dir, filename=out_file_path)

The NoData or NULL values in the raster generally appear as arbitrary negative values. These negatives can be removed using the negative argument:

rastertocsv(your_dir, filename=out_file_path, negative=False)

At times, the NoData or NULL values in the raster become '127' or '65536'. These can also be removed by declaring them explicitly:

rastertocsv(your_dir, filename=out_file_path, remove=[127, 65536])

This is a trial-and-error process; please check the generated CSV file for such issues and handle them as required.

Similarly, the CSV file may contain bad rows, i.e. rows with a zero value in all the bands. These take up a lot of unnecessary space on the drive and can be eliminated using:

rastertocsv(your_dir, filename=out_file_path, badrows=False)
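
The flattening and row filtering described in this section can be sketched with NumPy and the standard csv module. This is only an illustration under assumptions: the two small bands are synthetic, -9999 stands in for NoData, and dropping whole rows is one plausible reading of the negative and badrows options rather than the confirmed pyrsgis behaviour.

```python
import csv
import io
import numpy as np

# Two synthetic single-band rasters (2 x 3); -9999 stands in for NoData
band_1 = np.array([[10, 0, -9999], [12, 0, 14]])
band_2 = np.array([[20, 0, -9999], [22, 0, 24]])

# Flatten each band into one column of the table
table = np.column_stack([band_1.ravel(), band_2.ravel()])

# Drop rows containing negative values (assumed analogue of negative=False)
table = table[(table >= 0).all(axis=1)]
# Drop rows that are zero in every band (assumed analogue of badrows=False)
table = table[(table != 0).any(axis=1)]

buffer = io.StringIO()                  # in-memory stand-in for the CSV file
writer = csv.writer(buffer)
writer.writerow(['band_1', 'band_2'])   # hypothetical column names
writer.writerows(table.tolist())
csv_text = buffer.getvalue()
print(csv_text)
```

Of the six cells in each band, only the three rows with valid, non-zero values remain in the output.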

4. Creating northing and easting using a reference raster

pyrsgis lets you quickly create northing and easting rasters from a reference raster.

The cell values in the generated rasters are the row and column numbers of the cells. To generate these GeoTIFF files, start by importing the functions:

from pyrsgis.raster import northing, easting

reference_file_path = r'D:/your_reference_raster.tif'

northing(reference_file_path, outFile= r'D:/pyrsgis_northing.tif', flip=True)
easting(reference_file_path, outFile= r'D:/pyrsgis_easting.tif', flip=False)

As the name suggests, the flip argument flips the resulting rasters.
The value argument defaults to 'number'. It can be changed to 'normalised' to get a normalised layer. The other option for the value argument is 'coordinates', which produces a raster layer of cell centroid coordinates. Please note that if the value argument is set to 'normalised', the flip value is automatically set to False in both the easting and northing functions. Similarly, the dtype parameter adjusts automatically to the data type, but can be changed to a higher pixel depth when the value argument is 'number'. Example with all parameters:

northing(reference_file_path, outFile='pyrsgis_northing.tif', flip=True, value='number', dtype='int16')
easting(reference_file_path, outFile='pyrsgis_easting.tif', flip=False, value='number', dtype='int16')
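
The idea behind the 'number' and 'normalised' options can be sketched in NumPy. The following is an assumption-laden illustration, not the pyrsgis implementation: the 3 x 4 shape is a hypothetical reference raster, numbering starts at 0 here, and the normalisation to the 0-1 range is a guess at what 'normalised' means.

```python
import numpy as np

rows, cols = 3, 4   # shape of a hypothetical reference raster

# np.indices returns one array of row numbers and one of column numbers
row_idx, col_idx = np.indices((rows, cols))

northing_number = row_idx                       # cell value = row number
easting_number = col_idx                        # cell value = column number
northing_flipped = np.flipud(northing_number)   # reversed row order, like flip=True

# Assumed normalised variant: scale the indices to the 0-1 range
northing_norm = row_idx / (rows - 1)
easting_norm = col_idx / (cols - 1)

print(northing_number)
print(northing_flipped)
print(easting_norm)
```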

5. Shifting raster layers

You can shift raster layers using either the 'shift' or the 'shift_file' function. The 'shift' function shifts the data source in the backend, whereas 'shift_file' shifts the GeoTIFF file directly and stores the result as another file.

To shift in the backend:

from pyrsgis import raster

# Define the path to the input file and get the data source
infile = r"D:/path_to_your_file/input.tif"
ds, arr = raster.read(infile)

# Define the amount of shift required
delta_x = 15
delta_y = 11.7

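# shift the raster ('unit' is the default shift_type)
shifted_ds = raster.shift(ds, x=delta_x, y=delta_y, shift_type='unit')

# if you wish to export, define an output path (placeholder) and pass the shifted data source
out_file = r"D:/path_to_your_file/shifted_output.tif"
raster.export(arr, shifted_ds, out_file, dtype='int', bands=1, nodata=-9999)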

Here, 'ds' is the data source object that is created when the raster is read using the 'raster.read' function. 'x' and 'y' are the distances by which the raster is shifted. The 'shift_type' parameter lets you move the raster either by raster units or by a number of cells; the valid options are 'unit' and 'cell'. By default, 'shift_type' is 'unit'.
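
The difference between the 'unit' and 'cell' options can be illustrated on a GDAL-style geotransform, the six-number tuple that stores the raster origin and cell size. This is a minimal sketch under assumptions: shift_geotransform is a hypothetical helper, positive y is taken to move the raster north, and the actual pyrsgis implementation may differ.

```python
def shift_geotransform(geotransform, x=0, y=0, shift_type='unit'):
    """Hypothetical helper: return a GDAL-style geotransform moved by (x, y)."""
    x_origin, pixel_width, row_rot, y_origin, col_rot, pixel_height = geotransform
    if shift_type == 'cell':
        # Convert a number of cells into map units using the cell size
        x = x * pixel_width
        y = y * abs(pixel_height)
    elif shift_type != 'unit':
        raise ValueError("shift_type must be 'unit' or 'cell'")
    # Only the origin changes; cell size and rotation terms stay the same
    return (x_origin + x, pixel_width, row_rot, y_origin + y, col_rot, pixel_height)

# Made-up raster: 30 m cells, upper-left corner at (500000, 4200000);
# the pixel height is negative for north-up rasters
gt = (500000.0, 30.0, 0.0, 4200000.0, 0.0, -30.0)

shifted_by_units = shift_geotransform(gt, x=15, y=11.7)
shifted_by_cells = shift_geotransform(gt, x=2, y=1, shift_type='cell')

print(shifted_by_units)
print(shifted_by_cells)
```

With 30 m cells, a shift of 2 cells in x corresponds to 60 map units, whereas the same number passed with 'unit' would move the raster by only 2 units.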

To shift a GeoTIFF file:

from pyrsgis import raster

# Define the path to the input and output file
infile = r"D:/path_to_your_file/input.tif"
outfile = r"D:/path_to_your_file/shifted_output.tif"

# Define the amount of shift required
delta_x = 15
delta_y = 11.7

# shift the raster
raster.shift_file(infile, x=delta_x, y=delta_y, outfile=outfile, shift_type='unit', dtype='uint16')

Most of the parameters are the same as in the 'shift' function. The 'dtype' parameter is the same as the one used in the 'raster.export' function.

6. Create image chips for Convolutional Neural Network (CNN)

CNNs require image chips for training and prediction. Remote sensing images are large two- or three-dimensional arrays; this module enables creating image chips directly from TIF files or arrays. The input data and the size of the image chips are required.

To create image chips from array:

from pyrsgis import raster
from pyrsgis.ml import imageChipsFromArray

# read the TIF file(s) (both are of different sizes - for demonstration)
single_band_file = r'path/to/single_band.tif'
multi_band_file = r'path/to/multi_band.tif' # this is a Landsat 5 TM image (7 bands stacked)

# read the files as array using pyrsgis raster.read module
_, single_band_array = raster.read(single_band_file)
_, multi_band_array = raster.read(multi_band_file)

# create image chips
single_band_chips = imageChipsFromArray(single_band_array, x_size=5, y_size=5)
multi_band_chips = imageChipsFromArray(multi_band_array, x_size=5, y_size=5)

print(single_band_chips.shape)
print(multi_band_chips.shape)

The output:

(91125, 5, 5)
(987552, 5, 5, 7)
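
To make the output shapes above more concrete, here is a rough NumPy sketch of chip generation. It is not the pyrsgis code: chips_from_array is a hypothetical helper that assumes a band-first input array, one chip per pixel and zero padding at the edges. Only the channel-last chip layout is taken from the shapes printed above.

```python
import numpy as np

def chips_from_array(arr, x_size=5, y_size=5):
    """Hypothetical helper: one zero-padded chip centred on every pixel."""
    squeeze = arr.ndim == 2
    if squeeze:
        arr = arr[np.newaxis, ...]               # treat a 2D array as one band
    bands, rows, cols = arr.shape
    pad_y, pad_x = y_size // 2, x_size // 2
    # Pad only the row and column axes so edge pixels also get a full chip
    padded = np.pad(arr, ((0, 0), (pad_y, pad_y), (pad_x, pad_x)), mode='constant')
    chips = np.empty((rows * cols, y_size, x_size, bands), dtype=arr.dtype)
    for r in range(rows):
        for c in range(cols):
            window = padded[:, r:r + y_size, c:c + x_size]
            chips[r * cols + c] = np.moveaxis(window, 0, -1)   # bands last
    return chips[..., 0] if squeeze else chips

single = np.arange(12).reshape(3, 4)             # synthetic single-band raster
multi = np.ones((7, 3, 4))                       # synthetic 7-band raster

single_chips = chips_from_array(single)
multi_chips = chips_from_array(multi)

print(single_chips.shape)
print(multi_chips.shape)
```

Under these assumptions the number of chips equals the number of pixels, and the centre cell of each chip holds the value of the pixel it was built around.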

Image chips can also be generated directly from TIF files using the following:

from pyrsgis.ml import imageChipsFromFile

# read the TIF file(s) (both are of different sizes - for demonstration)
single_band_file = r'path/to/single_band.tif'
multi_band_file = r'path/to/multi_band.tif' # this is a Landsat 5 TM image (7 bands stacked)

# create image chips
single_band_chips = imageChipsFromFile(single_band_file, x_size=5, y_size=5)
multi_band_chips = imageChipsFromFile(multi_band_file, x_size=5, y_size=5)

print(single_band_chips.shape)
print(multi_band_chips.shape)

This will result in the same output as the one above.

7. Reading directly from .tar.gz files (beta)

Currently, only Landsat data is supported.

import pyrsgis

file_path = r'D:/your_file_name.tar.gz'
your_data = pyrsgis.readtar(file_path)

The above code reads the data and stores it in the your_data variable.
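
For orientation, a Landsat .tar.gz archive is a bundle of per-band TIF files plus metadata, and Python's standard tarfile module can list them. The sketch below is independent of pyrsgis and does not show how readtar works internally: list_band_files is a hypothetical helper, and the archive is built in memory with placeholder file names purely for demonstration.

```python
import io
import tarfile

def list_band_files(archive):
    """Hypothetical helper: return the .TIF member names inside a .tar.gz archive."""
    with tarfile.open(fileobj=archive, mode='r:gz') as tar:
        return sorted(name for name in tar.getnames() if name.upper().endswith('.TIF'))

# Build a tiny in-memory archive with placeholder names and contents
archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode='w:gz') as tar:
    for name in ['scene_B1.TIF', 'scene_B2.TIF', 'scene_MTL.txt']:
        payload = b'placeholder'
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
archive.seek(0)   # rewind so the archive can be read back

band_files = list_band_files(archive)
print(band_files)
```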

Various properties of the raster can be accessed using the following code:

print(your_data.rows)
print(your_data.cols)

This will display the number of rows and columns of the input data.

Similarly, the number of bands can be checked using:

print(your_data.nbands)

On reading the .tar.gz files directly, pyrsgis determines the satellite sensor. This can be checked using:

print(your_data.satellite)

This will display the satellite sensor, for instance, Landsat-5, Landsat-8, etc.

If the above code shows the correct satellite sensor, then the list of band names of the sensor (in order) can easily be checked using:

print(your_data.bandIndex)

Any particular band can be extracted using:

band_number = 1
your_band = your_data.getband(band_number)

The above code returns the band as an array, which can be visualised using:

display(your_band, maptitle='Title of your image', cmap='PRGn')

or, directly using:

band_number = 1
display(your_data.getband(band_number), maptitle='Title of your image', cmap='PRGn')

The generated map can directly be saved as an image.

The extracted band can be exported using:

out_file_path = r'D:/sample_output.tif'
your_data.export(your_band, out_file_path)

This saves the extracted band to the same directory.

To export a float type raster, please define the datatype explicitly; the default is 'int':

your_data.export(your_band, out_file_path, datatype='float')

The NDVI (Normalised Difference Vegetation Index) can be computed easily:

your_ndvi = your_data.ndvi()

Normalised difference index between any two bands can be computed using:

norm_diff = your_data.nordif(bandNumber2, bandNumber1)

This computes (band2-band1)/(band2+band1) in the back end and returns a numpy array. The resulting array can be exported using:

out_file_path = r'D:/your_ndvi.tif'
your_data.export(your_ndvi, out_file_path, datatype='float')

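Be careful with the data type of NDVI: its values are fractional (between -1 and 1), so export it with datatype='float'. The default integer type would not preserve the fractional values.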
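
The normalised difference formula given above, (band2-band1)/(band2+band1), can be written in a few lines of NumPy. This is a minimal sketch rather than the pyrsgis implementation: normalised_difference is a hypothetical helper, the band values are made up, and returning 0 where both bands sum to zero is a choice made here to avoid division by zero.

```python
import numpy as np

def normalised_difference(band2, band1):
    """Compute (band2 - band1) / (band2 + band1), returning 0 where the sum is 0."""
    band2 = band2.astype(np.float64)    # work in float so the ratio keeps its fraction
    band1 = band1.astype(np.float64)
    total = band2 + band1
    result = np.zeros_like(total)
    # Divide only where the denominator is non-zero; other cells keep the 0 fill
    np.divide(band2 - band1, total, out=result, where=total != 0)
    return result

nir = np.array([[50, 80], [0, 30]])     # made-up near-infrared values
red = np.array([[50, 20], [0, 10]])     # made-up red values

ndvi = normalised_difference(nir, red)
print(ndvi)
print(ndvi.dtype)
```

The result is a float array, which is why the float data type has to be requested explicitly on export.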
