Satellite extraction and cropper

Project description

Introduction

This Python code automates and simplifies the extraction and cropping of satellite image data from the Netherlands Space Office (NSO). The NSO provides free satellite images of the Netherlands, but it does not provide images fitted to a specific area, only large overlapping regions. This leads to an unnecessarily large amount of data, especially if you only want to study a smaller specific region, so cropping is needed, which is what this package provides.

This python code does the following steps:

  1. Searches the NSO for satellite images which contain a selected geographic area, given as a .geojson file. Parameters can be set for how strict this containment should be.
  2. Downloads, unzips and crops the satellite image found in step 1 to the selected area.
  3. Optionally calculates the Normalized Difference Vegetation Index (NDVI, used for example in crop analysis) or normalises the cropped region.
  4. Saves the cropped satellite image to a .tif file, with the option to also save it as a geopandas dataframe, and deletes the unused data.

An illustration of these steps is included in the repository.

If you only need a few satellite files, the NSO data portal should be enough: https://www.satellietdataportaal.nl/. You will, however, still need to crop the satellite images by hand.

Depending on your purpose, however, for example machine learning, you may want as many satellite images as possible (in a time series) and to automate as much as possible, which is what this Python code is intended for.

*This satellite data is only intended for Dutch legal entities, Dutch institutions or Dutch citizens. For the license terms of the NSO see this link: https://www.spaceoffice.nl/nl/satellietdataportaal/toegang-data/licentievoorwaarden/

Getting Started

  1. Get an NSO account; register at https://satellietdataportaal.nl/register.php
  2. Get a GeoJSON file of the region you want to study and crop to. Geojson.io can help you with that. Note that the coordinates have to be in WGS84! (Which should be the default for a GeoJSON file.)
  3. Make an instance of nso_georegion with the GeoJSON region, the folder where you want to store the cropped files, and the NSO account from step 1.
  4. Retrieve download links of satellite images which contain the selected region; parameters can be set for how strict this containment should be.
  5. Download, unzip and crop the found links.
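Since the package expects WGS84 coordinates, it can be useful to sanity-check your GeoJSON before step 3. Below is a minimal stdlib-only heuristic (the helper name and approach are ours, not part of this package): WGS84 longitudes and latitudes always fall within [-180, 180] and [-90, 90], so values far outside those ranges usually mean a projected CRS such as RD New (EPSG:28992).

```python
import json


def looks_like_wgs84(geojson_path):
    """Heuristic check: WGS84 longitude/latitude values fall within
    [-180, 180] and [-90, 90]; projected coordinates usually do not."""
    with open(geojson_path) as f:
        data = json.load(f)
    points = []

    def collect(coords):
        # GeoJSON nests coordinate arrays to arbitrary depth; recurse
        # until we reach the [lon, lat] pairs themselves.
        if isinstance(coords[0], (int, float)):
            points.append(coords)
        else:
            for sub in coords:
                collect(sub)

    for feature in data["features"]:
        collect(feature["geometry"]["coordinates"])
    return all(-180 <= lon <= 180 and -90 <= lat <= 90
               for lon, lat, *_ in points)
```

If the check fails, reprojecting with geopandas (`gdf.to_crs(epsg=4326)`) is the usual fix.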

Example code

# This is how to import nso.
import satellite_images_nso.api.nso_georegion as nso
# The sat_manipulator provides other handy transformations of satellite data, e.g. from .tif files to a geopandas dataframe.
import satellite_images_nso.api.sat_manipulator as sat_manipulator

path_geojson = "/src/example/example.geojson"
output_folder = "./src/output/"
# The first parameter is the path to the geojson, the second the folder where the cropped satellite data will be stored, the third your NSO username and the last your NSO password.
georegion = nso.nso_georegion(path_geojson,output_folder,\
                              YOUR_USER_NAME_HERE,\
                             YOUR_PASSWORD_HERE)

# This method fetches all the download links for satellite images from the NSO which contain the region in the given geojson.
# The max_diff parameter represents the fraction of the selected region that has to be inside the satellite image:
# 1 means the selected region has to be fully inside the satellite image, while 0.7 means only 70% of the selected region has to be.
links = georegion.retrieve_download_links(max_diff = 0.7)


# This example filters the links down to 50 cm RGB Infrared Superview satellite imagery taken in the summer.
season = "Summer"
links_group = []
for link in links:
    if 'SV' in link and '50cm' in link and 'RGBI' in link:
        # The filename starts with a timestamp; characters 4:6 are the month.
        month = int(link.split("/")[-1][4:6])
        if sat_manipulator.get_season_for_month(month)[0] == season:
            links_group.append(link)



# Downloads a satellite image from the NSO, crops it so that it fits the geojson region and calculates the NDVI index.
# The output will be stored in the output folder.
# The full signature is: execute_link(self, link, calculate_nvdi = True, delete_zip_file = True, delete_source_files = True, check_if_file_exists = True, relative_75th_normalize = False, plot = True)
# A description of these parameters can be found in the code.
georegion.execute_link(links_group[0])

# This function reads a .tif file, the format in which the satellite data is stored, and converts it to a pixel-based geopandas dataframe.
# Mainly for machine learning purposes.
path_to_vector = "path/to/folder/*.tif"
geo_df_pixel = sat_manipulator.tranform_vector_to_pixel_gpdf(path_to_vector)
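For reference, the NDVI mentioned above is computed from the red and near-infrared bands as (NIR - Red) / (NIR + Red). Below is a minimal sketch of that formula with numpy (our own illustration, not the package's implementation):

```python
import numpy as np


def ndvi(red, nir):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).
    Values near 1 indicate dense vegetation, values near 0 bare soil."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    # A tiny epsilon avoids division by zero on empty (0, 0) pixels.
    return (nir - red) / (nir + red + 1e-12)
```

Applied to the pixel dataframe above this would look something like `ndvi(geo_df_pixel["red"], geo_df_pixel["nir"])`, assuming band columns with those (hypothetical) names exist.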

See also the Jupyter notebook in src/nso_notebook_example.ipynb

Class diagram

A class diagram image is included in the repository.

Installation

Install this package with: pip install satellite_images_nso

Be sure you have installed the required packages; follow the instructions below.

Package installation

If you are a Windows user you have to install the dependencies via wheels. The wheels for the following dependencies should be downloaded from https://www.lfd.uci.edu/~gohlke/pythonlibs/:

  • GDAL>=3.0.4
  • Fiona>=1.8.13
  • rasterio>=1.1.3
  • Shapely>=1.7.0
  • geopandas>=0.9.0

These should be installed in the following order: first GDAL, then Fiona and then rasterio. After these you can install the rest.
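Spelled out, the ordered installation might look as follows (the wheel filenames here are only examples for 64-bit Windows with Python 3.9; substitute the versions you actually downloaded):

```shell
pip install GDAL-3.4.3-cp39-cp39-win_amd64.whl
pip install Fiona-1.8.21-cp39-cp39-win_amd64.whl
pip install rasterio-1.2.10-cp39-cp39-win_amd64.whl
pip install Shapely-1.8.2-cp39-cp39-win_amd64.whl
pip install geopandas
```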

Download the wheels according to your system settings. For instance, the wheel rasterio-1.2.10-cp39-cp39-win_amd64.whl is for the 64-bit version of Windows with Python 3.9. Install a wheel with pip install XXX.XX.XX.whl.
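To work out which wheel tag matches your interpreter, this stdlib-only snippet (our own helper, not part of the package) prints the cpXY/platform tag to look for in the filenames; for an authoritative list of compatible tags, `pip debug --verbose` can also be used.

```python
import struct
import sys


def wheel_tag():
    """Return the CPython/platform tag to look for in a Windows wheel
    filename, e.g. 'cp39-cp39-win_amd64' for 64-bit Python 3.9."""
    py = f"cp{sys.version_info.major}{sys.version_info.minor}"
    bits = struct.calcsize("P") * 8  # pointer size: 64-bit or 32-bit build
    platform_tag = "win_amd64" if bits == 64 else "win32"
    return f"{py}-{py}-{platform_tag}"


print(wheel_tag())
```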

By installing the above packages, the following required packages should, in principle, be automatically installed as well:

  • earthpy
  • matplotlib
  • numpy
  • objectpath
  • pandas
  • requests
  • setuptools
  • pyproj

If you run into problems, check out this GIS Stack Exchange post: https://gis.stackexchange.com/questions/2276/installing-gdal-with-python-on-windows

Install GDAL on Databricks

If you are using Databricks, use this code to set up an init script which installs GDAL.

    dbutils.fs.mkdirs("dbfs:/databricks/FileStore/init_script")
    dbutils.fs.put("/databricks/FileStore/init_script/gdal.sh","""
        #!/bin/bash
        set -ex
        /databricks/python/bin/python -V
        source /databricks/conda/etc/profile.d/conda.sh
        conda activate /databricks/python
        conda install -y gdal""", True)

Install GDAL on MacOS

Install GDAL by using Brew:
brew install gdal

Run as a docker container

docker run -it --entrypoint bash dockerhubpzh/satellite_images_nso_docker

See: https://hub.docker.com/r/dockerhubpzh/satellite_images_nso_docker

Local development

Run rebuild.bat to build and install the package on your local computer.

Authors

Michael de Winter

Daniel Overdevest

Yilong Wen

Contact

Contact us at vdwh@pzh.nl

Download files

Download the file for your platform.

Source Distribution

satellite_images_nso-1.2.1.tar.gz (26.2 kB)

Uploaded Source

Built Distribution

satellite_images_nso-1.2.1-py3-none-any.whl (27.9 kB)

Uploaded Python 3

File details

Details for the file satellite_images_nso-1.2.1.tar.gz.

File metadata

  • Download URL: satellite_images_nso-1.2.1.tar.gz
  • Upload date:
  • Size: 26.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.4

File hashes

Hashes for satellite_images_nso-1.2.1.tar.gz
Algorithm Hash digest
SHA256 eddb42ba63862e786d1ea8d2905a091ddb26dbc425f38be8ad4950d04a72e1aa
MD5 d74642dda5f5eef5c7ddea6d7fd0c627
BLAKE2b-256 4a83ceff2e823ed69e10e8519dae069161927122bf1d8cc494ae347569cf5ad5


File details

Details for the file satellite_images_nso-1.2.1-py3-none-any.whl.

File hashes

Hashes for satellite_images_nso-1.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 dbb218f07337d7caa6106fc15923b14e99a3afef9847253d9045d7d215a5ba59
MD5 46f9dfce219cb016589398340c1869f0
BLAKE2b-256 d66dbd3b3f5c4386bd678660709490a7381e201cf9662874bd94031b7f4d2302
