
Solafune internal geodata management tools


solafune_tools: Internal Geodata Creation and Management Tools

This package contains tools to download STAC catalogs and Sentinel-2 imagery from Planetary Computer and assemble them into a cloudless mosaic. Other tools will be added in the future.

Quickstart

Install the package using pip or uv pip. Python 3.10 is recommended:

uv pip install solafune_tools

Before using the library, you can set the directory where you want to store data by calling

solafune_tools.set_data_directory(dir_path="your_data_dir_here")

The above command sets the environment variable solafune_tools_data_dir, from which all sub-modules draw their file paths. It is not set persistently (i.e., not written to .bashrc or similar), so you will need to set it each time you ssh into your machine or after a reboot. If you do not set it explicitly, the library defaults to creating/using a data folder within your current working directory.
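The fallback described above can be sketched with the standard library alone. This is only an illustration of the described behaviour, not the library's actual code: resolve_data_dir is a hypothetical helper, and the default folder name "data" is an assumption based on the phrase "a data folder".

```python
import os

# Environment variable name taken from the description above.
ENV_VAR = "solafune_tools_data_dir"


def resolve_data_dir() -> str:
    """Hypothetical helper mirroring the described fallback; not part of solafune_tools."""
    # Assumed default: a folder named "data" inside the current working directory.
    default_dir = os.path.join(os.getcwd(), "data")
    # Use the environment variable if it is set, otherwise fall back to the default.
    return os.environ.get(ENV_VAR, default_dir)


print(resolve_data_dir())
```

If the library reads only this environment variable, exporting it in your shell profile would presumably persist the setting across sessions, but that is an assumption worth verifying before relying on it.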

A one-shot command exists to make a cloudless mosaic given a date range and an area of interest. Before running this function, create a Dask cluster and client. The function uses lazy, chunked xarray DataArrays, which can (and should) be processed in parallel. The simplest way to do so is to open a Jupyter notebook and paste the following code into it. If you call this from within a Python script, you need to put it under an if __name__ == "__main__": block for it to work.

from dask.distributed import Client, LocalCluster

cluster = LocalCluster()
client = Client(cluster)
client

It will print out a dashboard link for your cluster that you can use to track the progress of your function. The actual function call is below.

mosaics_catalog = solafune_tools.create_basemap(
    start_date="2023-05-01",
    end_date="2023-08-01",
    aoi_geometry_file="data/geojson/xyz_bounds.geojson",
    bands=["B02", "B03", "B04"],
    mosaic_epsg="Auto",
    mosaic_resolution=100,
)

The output is a link to a STAC catalog of all mosaics generated so far in the current data directory. See point 6 in the workflow below to see how to load and query it.
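The aoi_geometry_file argument above points at a GeoJSON file. A minimal sketch of producing one from a bounding box follows. The helper write_bbox_geojson, the file name example_bounds.geojson, and the coordinates are all hypothetical, and it is an assumption that the library accepts a single-polygon FeatureCollection in longitude/latitude.

```python
import json
import os
import tempfile


def write_bbox_geojson(path, west, south, east, north):
    """Write a one-feature GeoJSON FeatureCollection covering a lon/lat bounding box."""
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # GeoJSON rings must close on their first point
    ]
    collection = {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "properties": {},
                "geometry": {"type": "Polygon", "coordinates": [ring]},
            }
        ],
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(collection, f)
    return path


# Illustrative coordinates only; replace them with your own area of interest.
aoi_path = write_bbox_geojson(
    os.path.join(tempfile.gettempdir(), "example_bounds.geojson"),
    west=139.5, south=35.5, east=140.0, north=36.0,
)
print(aoi_path)
```

The resulting path could then be passed as aoi_geometry_file, assuming the file layout matches what the library expects.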

A typical workflow to assemble a cloudless mosaic is as follows. I strongly recommend leaving all output file and output directory names set to 'Auto' if you choose to run these functions one by one.

  1. Get the Sentinel-2 catalog items for your area of interest (pass in a geojson) and a date range.
plc_stac_catalog = solafune_tools.planetary_computer_stac_query(
    start_date="2023-05-01",
    end_date="2023-08-01",
    aoi_geometry_file="data/geojson/xyz_bounds.geojson",
    outfile_name='Auto'
)
  2. Download files for the bands you want for these catalog items.
tiffile_dir = solafune_tools.planetary_computer_fetch_images(
    dataframe_path=plc_stac_catalog,
    bands=["B02", "B03", "B04"],
    outfile_dir='Auto',
)
  3. Assemble a STAC catalog of local files (this is necessary for mosaicking).
local_stac_catalog = solafune_tools.create_local_catalog_from_existing(
    input_catalog_parquet=plc_stac_catalog,
    bands=["B04", "B03", "B02"],
    tif_files_dir=tiffile_dir,
    outfile_dir='Auto',
)
  4. Make a cloudless mosaic. Make sure to have a Dask cluster running for this step. Otherwise, it will either take days to finish or crash with memory errors.
mosaic_file_loc = solafune_tools.create_mosaic(
    local_stac_catalog=local_stac_catalog,
    aoi_geometry_file="data/geojson/xyz_bounds.geojson",
    outfile_loc='Auto',
    out_epsg='Auto',
    resolution=100,
)
  5. Update the STAC catalog for the mosaics folder.
import os  # needed for os.path.dirname below
mosaics_catalog = solafune_tools.create_local_catalog_from_scratch(
    infile_dir=os.path.dirname(mosaic_file_loc),
    outfile_loc='Auto',
)
  6. The STAC catalog contains the geometry, date range and bands for each mosaic tif stored in the directory. You can query the catalog by loading it as a GeoPandas GeoDataFrame and filtering on various conditions. The link for each mosaic is stored in the assets column, under the dictionary key mosaic and then href.
geodataframe = solafune_tools.get_catalog_items_as_gdf(mosaics_catalog)
your_query = geodataframe.geometry.intersects(your_roi_geometry) & (geodataframe['datetime']=='2021-03-01')
results = geodataframe[your_query]
your_mosaic_tif_locs = [asset['mosaic']['href'] for asset in results.assets]
# merge your mosaic tifs, do windowed reads, whatever else you need
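To make the nested assets layout from step 6 concrete, here is a sketch using plain dictionaries in place of GeoDataFrame rows. The rows, dates, and file paths are made up for illustration, and mosaic_hrefs is a hypothetical helper, not a library function.

```python
# Hypothetical catalog rows; the dates and paths are invented for illustration.
rows = [
    {"datetime": "2021-03-01", "assets": {"mosaic": {"href": "/data/mosaics/a.tif"}}},
    {"datetime": "2021-04-01", "assets": {"mosaic": {"href": "/data/mosaics/b.tif"}}},
]


def mosaic_hrefs(rows, date):
    """Return the mosaic href of every row whose datetime equals the given date."""
    return [
        row["assets"]["mosaic"]["href"]
        for row in rows
        if row["datetime"] == date
    ]


print(mosaic_hrefs(rows, "2021-03-01"))
```

The same two-level lookup, assets then mosaic then href, is what the list comprehension in step 6 performs on the filtered GeoDataFrame.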
