
Solafune internal geodata management tools


solafune_tools: Internal Geodata Creation and Management Tools

This package contains tools to download STAC catalogs and Sentinel-2 imagery from Planetary Computer and assemble them into a cloudless mosaic. Other tools will be added in the future.

Quickstart

Install the package using pip or uv pip. Python 3.10 is recommended:

uv pip install solafune_tools

All public-facing functions have detailed docstrings explaining their expected inputs and outputs. You can read any of them with print(solafune_tools.function_name.__doc__) (without print, the docstring is shown as an unformatted string) or with ??solafune_tools.function_name in Jupyter notebooks. Before using the library, you can set the directory where you want to store data by calling

solafune_tools.set_data_directory(dir_path="your_data_dir_here")

The above command sets the environment variable solafune_tools_data_dir, from which all sub-modules draw their file paths. It is not set persistently (i.e., not written to .bashrc or similar), so you will need to set it again each time you ssh into your machine or after a reboot. If you do not set it explicitly, the package defaults to creating/using a data folder within your current working directory.
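The fallback behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the package's actual code: resolve_data_directory is a hypothetical helper, and the fallback folder name "data" is an assumption based on the wording above.

```python
import os
from pathlib import Path

# Name of the environment variable, as stated in the text above.
ENV_VAR = "solafune_tools_data_dir"


def resolve_data_directory() -> Path:
    """Return the directory named by ENV_VAR, or ./data if it is unset.

    Hypothetical helper: it mirrors the documented fallback, not the
    package's real implementation. The folder name "data" is an assumption.
    """
    configured = os.environ.get(ENV_VAR)
    if configured:
        return Path(configured)
    return Path.cwd() / "data"


# The variable lives only in this process's environment, which is why it
# has to be set again in every new session.
os.environ[ENV_VAR] = "/tmp/my_geodata"  # hypothetical path
print(resolve_data_directory())
```

Calling solafune_tools.set_data_directory at the top of every notebook or script avoids relying on the working-directory fallback.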

A one-shot command exists to make a cloudless mosaic given a date range and an area of interest. Before running this function, create a Dask cluster and client. The function uses lazy, chunked xarray DataArrays, which can (and should) be processed in parallel. The simplest way to do so is to open a Jupyter notebook and paste the following code into it. If you call this from within a Python script, you need to put it under an if __name__ == "__main__": block for it to work.

from dask.distributed import Client, LocalCluster

cluster = LocalCluster()
client = Client(cluster)
client

It will print out a dashboard link for your cluster that you can use to track the progress of your function. The actual function call is below.

mosaics_catalog = solafune_tools.create_basemap(
    start_date="2023-05-01",
    end_date="2023-08-01",
    aoi_geometry_file="data/geojson/xyz_bounds.geojson",
    bands="Auto",
    mosaic_epsg="Auto",
    mosaic_resolution=100,
    clip_to_aoi=True,
)
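The aoi_geometry_file argument points at a GeoJSON file. If you do not have one yet, the sketch below writes a rectangular area of interest using only the standard library. The bounding-box coordinates are made up for illustration, and wrapping the polygon in a FeatureCollection is an assumption; check the create_basemap docstring for the layout the function actually expects.

```python
import json
from pathlib import Path

# Hypothetical bounding box (min_lon, min_lat, max_lon, max_lat) in WGS84.
min_lon, min_lat, max_lon, max_lat = 139.5, 35.5, 140.0, 36.0

# A GeoJSON polygon ring is a closed list of [lon, lat] positions:
# the first and last positions are identical.
ring = [
    [min_lon, min_lat],
    [max_lon, min_lat],
    [max_lon, max_lat],
    [min_lon, max_lat],
    [min_lon, min_lat],
]

aoi = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {},
            "geometry": {"type": "Polygon", "coordinates": [ring]},
        }
    ],
}

# Same relative path as in the example call above.
aoi_path = Path("data/geojson/xyz_bounds.geojson")
aoi_path.parent.mkdir(parents=True, exist_ok=True)
aoi_path.write_text(json.dumps(aoi, indent=2))
```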

If you want your mosaic broken up into tiles, pass in a tile_size argument (size in pixels). Tiles for the call below will be 100x100 pixels, except along the right and bottom boundaries of the mosaic, where they may be rectangular and smaller because the mosaic does not accommodate an integer number of tiles. You can also pass a list for bands, like bands=['B02','B04'], if you want to select only certain bands to make a mosaic. Further, you can choose to make several single-band mosaics or one multiband mosaic by passing 'Singleband' or 'Multiband' as the mosaic_style argument.

mosaics_catalog = solafune_tools.create_basemap(
    start_date="2023-05-01",
    end_date="2023-08-01",
    aoi_geometry_file="data/geojson/xyz_bounds.geojson",
    bands=['B02','B04'],
    mosaic_epsg="Auto",
    mosaic_resolution=100,
    clip_to_aoi=True,
    tile_size=100,
    mosaic_style='Multiband',
)
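To see why edge tiles come out smaller, here is a small sketch of the tiling arithmetic. It is not the package's code: tile_grid is a hypothetical helper, the 250x120 pixel mosaic size is invented, and the row-by-row order starting at the top-left corner is an assumption.

```python
def tile_grid(width: int, height: int, tile_size: int) -> list[tuple[int, int, int, int]]:
    """Return (col_offset, row_offset, tile_width, tile_height) for each tile.

    Tiles are laid out row by row from the top-left corner. Tiles that
    touch the right or bottom edge are truncated to fit inside the mosaic.
    """
    tiles = []
    for row_off in range(0, height, tile_size):
        for col_off in range(0, width, tile_size):
            tile_width = min(tile_size, width - col_off)
            tile_height = min(tile_size, height - row_off)
            tiles.append((col_off, row_off, tile_width, tile_height))
    return tiles


# Hypothetical 250x120 pixel mosaic cut into 100-pixel tiles.
tiles = tile_grid(width=250, height=120, tile_size=100)
print(len(tiles))   # 6 tiles: 3 columns x 2 rows
print(tiles[0])     # (0, 0, 100, 100): a full tile
print(tiles[2])     # (200, 0, 50, 100): truncated at the right edge
print(tiles[-1])    # (200, 100, 50, 20): truncated at the bottom-right corner
```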

The output is a link to a STAC catalog of all mosaics generated so far in the current data directory. See point 6 in the workflow below to see how to load and query it.

A typical workflow to assemble a cloudless mosaic is as follows. I strongly recommend leaving all outfile and outdirectory naming to 'Auto' if you choose to run these functions one by one.

  1. Get the Sentinel-2 catalog items for your area of interest (pass in a GeoJSON file) and a date range.
plc_stac_catalog = solafune_tools.planetary_computer_stac_query(
    start_date="2023-05-01",
    end_date="2023-08-01",
    aoi_geometry_file="data/geojson/xyz_bounds.geojson",
    outfile_name='Auto'
)
  2. Download files for the bands you want for these catalog items.
tiffile_dir = solafune_tools.planetary_computer_fetch_images(
    dataframe_path=plc_stac_catalog,
    bands=["B02", "B03", "B04"],
    outfile_dir='Auto',
)
  3. Assemble a STAC catalog of local files (this is necessary for mosaicking).
local_stac_catalog = solafune_tools.create_local_catalog_from_existing(
    input_catalog_parquet=plc_stac_catalog,
    bands=["B04", "B03", "B02"],
    tif_files_dir=tiffile_dir,
    outfile_dir='Auto',
)
  4. Make a cloudless mosaic. Make sure to have a Dask cluster running for this step. Otherwise, it will either take days to finish or crash with memory errors. Only pass in a geometry file if you want your mosaic clipped to that geometry.
mosaic_file_loc = solafune_tools.create_mosaic(
    local_stac_catalog=local_stac_catalog,
    aoi_geometry_file=None,
    outfile_loc='Auto',
    out_epsg='Auto',
    resolution=100,
    bands='Auto',
    mosaic_style='Multiband'
)
  5. Update the STAC catalog for the mosaics folder.
mosaics_catalog = solafune_tools.create_local_catalog_from_scratch(
    infile_dir='Auto',
    outfile_loc='Auto'
)
  6. The STAC catalog contains the geometry, date range and bands for each mosaic tif stored in the directory. You can now query the catalog by loading it as a GeoPandas GeoDataFrame and filtering on various conditions. The link for each mosaic is stored in the assets column, under the dictionary key mosaic and then href.
geodataframe = solafune_tools.get_catalog_items_as_gdf(mosaics_catalog)
your_query = geodataframe.geometry.intersects(your_roi_geometry) & (geodataframe['datetime']=='2021-03-01')
results = geodataframe[your_query]
your_mosaic_tif_locs = [asset['mosaic']['href'] for asset in results.assets]
# merge your mosaic tifs, do windowed reads, whatever else you need
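The nested layout of the assets column can be illustrated without GeoPandas. In the sketch below, plain dictionaries stand in for catalog rows. The file paths and dates are invented, and a real catalog also carries a geometry column that is omitted here.

```python
# Hypothetical stand-ins for catalog rows; real rows come from
# solafune_tools.get_catalog_items_as_gdf and also carry a geometry.
rows = [
    {"datetime": "2021-03-01", "assets": {"mosaic": {"href": "data/mosaic/a.tif"}}},
    {"datetime": "2021-06-01", "assets": {"mosaic": {"href": "data/mosaic/b.tif"}}},
    {"datetime": "2021-03-01", "assets": {"mosaic": {"href": "data/mosaic/c.tif"}}},
]

# Filter on the date, then follow assets -> mosaic -> href for each match.
matching = [row for row in rows if row["datetime"] == "2021-03-01"]
mosaic_tif_locs = [row["assets"]["mosaic"]["href"] for row in matching]
print(mosaic_tif_locs)  # ['data/mosaic/a.tif', 'data/mosaic/c.tif']
```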

