A geospatial data processing and analysis toolkit

geosptools

geosptools is a Python toolkit for geospatial raster data processing and analysis. Built on top of GDAL (Geospatial Data Abstraction Library), it converts NetCDF files to raster formats and merges independent raster datasets. The package emphasises input validation and explicit error reporting around each GDAL operation.

Features

  • NetCDF to Raster Conversion:

    • Convert NetCDF files to various raster formats (GeoTIFF, JPEG, PNG, etc.)
    • Configurable resolution and coordinate reference systems
    • Batch processing with nested list support
    • Comprehensive error handling and validation
  • Raster Merging Operations:

    • Merge independent raster files from multiple regions
    • Synchronised processing of multi-region datasets
    • Flexible output format configuration
    • NoData value handling and projection preservation
  • GDAL Integration:

    • Professional GDAL-based processing workflows
    • Support for multiple raster formats and drivers
    • Efficient memory management and dataset handling
    • Robust error reporting and debugging capabilities
  • Defensive Programming:

    • Automatic nested list flattening for file inputs
    • Comprehensive parameter validation
    • Enhanced error handling with detailed diagnostics
    • Type safety with modern Python annotations

Installation

Prerequisites

Before installing, please ensure the following dependencies are available on your system:

  • GDAL Library (system-level installation required):

    # Ubuntu/Debian
    sudo apt-get update
    sudo apt-get install gdal-bin libgdal-dev
    
    # macOS (using Homebrew)
    brew install gdal
    
    # CentOS/RHEL
    sudo yum install gdal gdal-devel
    
  • Required Python Libraries:

    pip install gdal numpy
    

    Or via Anaconda (recommended for GDAL compatibility):

    conda install -c conda-forge gdal numpy
    
  • Internal Package Dependencies:

    pip install pygenutils paramlib
    

Installation (from PyPI)

Install the package using pip:

pip install geosptools

Development Installation

For development purposes, you can install the package in editable mode:

git clone https://github.com/yourusername/geosptools.git
cd geosptools
pip install -e .

Usage

Basic Example - NetCDF to Raster Conversion

from geosptools.raster_tools import nc2raster

# Convert a single NetCDF file to GeoTIFF
nc2raster(
    nc_file_list="temperature_data.nc",
    output_file_format="GTiff",
    raster_extension="tif",
    raster_resolution=300,
    nodata_value=-9999,
    crs="EPSG:4326"
)

# Batch convert multiple NetCDF files
nc_files = ["temp_2020.nc", "temp_2021.nc", "temp_2022.nc"]
nc2raster(
    nc_file_list=nc_files,
    output_file_format="GTiff",
    raster_extension="tif",
    raster_resolution=500
)

Advanced Example - Nested List Processing

from geosptools.raster_tools import nc2raster

# Handle complex nested file structures automatically
nested_files = [
    ["region1_temp.nc", "region1_precip.nc"],
    ["region2_temp.nc", "region2_precip.nc"],
    "global_summary.nc"
]

# Defensive programming automatically flattens nested lists
nc2raster(
    nc_file_list=nested_files,
    output_file_format="GTiff",
    raster_extension="tif",
    raster_resolution=1000,
    nodata_value=-32768,
    crs="EPSG:3857"  # Web Mercator projection
)
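
The flattening step mentioned above can be pictured with a small sketch. The helper name `flatten_file_list` is hypothetical and is not taken from the package; it only illustrates one way a mix of strings and nested lists could be reduced to a flat list of file names before processing.

```python
def flatten_file_list(items):
    """Recursively flatten nested lists/tuples of file names (hypothetical helper)."""
    # A single file name is treated as a one-element list
    if isinstance(items, str):
        return [items]
    flat = []
    for item in items:
        if isinstance(item, (list, tuple)):
            # Descend into nested containers
            flat.extend(flatten_file_list(item))
        else:
            flat.append(item)
    return flat


nested_files = [
    ["region1_temp.nc", "region1_precip.nc"],
    ["region2_temp.nc", "region2_precip.nc"],
    "global_summary.nc",
]
flat_files = flatten_file_list(nested_files)
print(flat_files)
```

The order of the files is preserved, so the flat list reads the same way as the nested one from top to bottom.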

Multi-Region Raster Merging

from geosptools.raster_tools import merge_independent_rasters

# Define raster files for different regions
raster_data = {
    "north_region": [
        "north_temp_jan.tif",
        "north_temp_feb.tif",
        "north_temp_mar.tif"
    ],
    "south_region": [
        "south_temp_jan.tif",
        "south_temp_feb.tif",
        "south_temp_mar.tif"
    ],
    "central_region": [
        "central_temp_jan.tif",
        "central_temp_feb.tif",
        "central_temp_mar.tif"
    ]
}

# Merge corresponding files from each region
merge_independent_rasters(
    raster_files_dict=raster_data,
    output_file_format="GTiff",
    joint_region_name="combined",
    output_file_name_ext="tif",
    nodata_value=-9999
)
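
When file names follow a regular pattern, the dictionary above does not have to be typed by hand. The following is a minimal sketch, assuming the `<region>_temp_<month>.tif` naming used in this example; `build_region_files` is a hypothetical helper, not part of geosptools.

```python
def build_region_files(regions, months, variable="temp", ext="tif"):
    """Build a region -> file list mapping from a naming pattern (hypothetical helper)."""
    return {
        f"{region}_region": [f"{region}_{variable}_{month}.{ext}" for month in months]
        for region in regions
    }


raster_data = build_region_files(
    regions=["north", "south", "central"],
    months=["jan", "feb", "mar"],
)
for region_name, files in raster_data.items():
    print(region_name, files)
```

Because every region is generated from the same list of months, all file lists have the same length and the same ordering.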

Climate Data Processing Example

from geosptools.raster_tools import nc2raster, merge_independent_rasters

# Step 1: Convert climate NetCDF files to rasters
climate_files = [
    "ERA5_temperature_2023.nc",
    "ERA5_precipitation_2023.nc",
    "ERA5_humidity_2023.nc"
]

nc2raster(
    nc_file_list=climate_files,
    output_file_format="GTiff",
    raster_extension="tif",
    raster_resolution=1000,
    crs="EPSG:4326"
)

# Step 2: Merge regional climate rasters
# (these regional .tif files are separate inputs assumed to exist already)
regional_data = {
    "europe": ["EUR_temp_2023.tif", "EUR_precip_2023.tif"],
    "asia": ["ASIA_temp_2023.tif", "ASIA_precip_2023.tif"],
    "africa": ["AFR_temp_2023.tif", "AFR_precip_2023.tif"]
}

merge_independent_rasters(
    raster_files_dict=regional_data,
    output_file_format="GTiff",
    joint_region_name="global",
    output_file_name_ext="tif"
)

Project Structure

The package is organised as a focused raster processing toolkit:

geosptools/
├── raster_tools.py              # Core raster processing functions
├── __init__.py                  # Package initialisation
├── CHANGELOG.md                 # Version history and changes
└── README.md                    # Package documentation

Key Functions

nc2raster()

Purpose: Convert NetCDF files to various raster formats using GDAL

Key Features:

  • Supports single files, lists, and nested lists of NetCDF files
  • Configurable output formats (GeoTIFF, JPEG, PNG, etc.)
  • Customisable resolution and coordinate reference systems
  • NoData value handling and projection settings
  • Comprehensive error handling and progress reporting

Parameters:

  • nc_file_list: NetCDF file(s) to convert (supports nested lists)
  • output_file_format: GDAL driver name (e.g., "GTiff", "JPEG")
  • raster_extension: Output file extension
  • raster_resolution: Resolution for output rasters
  • nodata_value: NoData value for raster files (optional)
  • crs: Coordinate reference system (default: "EPSG:4326")
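
Since `crs` is passed as a string, a quick shape check before calling `nc2raster` can catch typos early. The sketch below is an illustration only: `looks_like_epsg` is a hypothetical helper, and it checks the `"EPSG:<code>"` spelling without asking GDAL whether the code actually exists.

```python
import re

# "EPSG:" followed by one or more digits, e.g. "EPSG:4326"
_EPSG_PATTERN = re.compile(r"EPSG:\d+")


def looks_like_epsg(crs):
    """Return True if crs is a string shaped like 'EPSG:<digits>' (hypothetical helper)."""
    return isinstance(crs, str) and _EPSG_PATTERN.fullmatch(crs) is not None


for candidate in ["EPSG:4326", "EPSG:3857", "epsg4326", 4326]:
    print(repr(candidate), looks_like_epsg(candidate))
```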

merge_independent_rasters()

Purpose: Merge corresponding raster files from multiple regions into unified outputs

Key Features:

  • Synchronised processing of multi-region datasets
  • Automatic validation of input file consistency
  • Preserves geospatial metadata and projections
  • Flexible output naming and format configuration
  • Robust error handling for GDAL operations

Parameters:

  • raster_files_dict: Dictionary mapping region names to file lists
  • output_file_format: GDAL driver for output format
  • joint_region_name: Name for combined region in output files
  • output_file_name_ext: Extension for output files
  • nodata_value: NoData value handling (optional)
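
One way to picture "corresponding" files is by position: the first file of every region forms one group, the second file forms the next group, and so on. This is an assumption about the pairing rule rather than a description of the package internals, and `group_by_position` is a hypothetical helper.

```python
def group_by_position(raster_files_dict):
    """Pair up the i-th file of every region (hypothetical helper)."""
    # zip stops at the shortest list, so list lengths should be checked beforehand
    return [list(group) for group in zip(*raster_files_dict.values())]


raster_data = {
    "north_region": ["north_temp_jan.tif", "north_temp_feb.tif"],
    "south_region": ["south_temp_jan.tif", "south_temp_feb.tif"],
}
groups = group_by_position(raster_data)
for group in groups:
    print(group)
```

Under this reading, the order of files inside each list matters: a misordered list would pair rasters from different time steps.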

Advanced Features

Defensive Programming

  • Nested List Support: Automatically flattens complex nested file structures
  • Parameter Validation: Comprehensive input validation with detailed error messages
  • Type Safety: Modern Python type annotations (PEP 604 union syntax) for better IDE support
  • Error Handling: Detailed RuntimeError and ValueError reporting for debugging

GDAL Integration

  • Professional Workflows: Proper dataset opening, processing, and closing
  • Memory Management: Efficient handling of large raster datasets
  • Format Support: Wide range of raster formats through GDAL drivers
  • Metadata Preservation: Maintains geospatial information during processing

Performance Optimisation

  • Batch Processing: Efficient handling of multiple files
  • Progress Reporting: Real-time feedback during long operations
  • Resource Management: Proper cleanup of GDAL datasets and memory

Supported Formats

Input Formats

  • NetCDF (.nc) - Primary input format for conversion
  • Various raster formats - For merging operations (GeoTIFF, JPEG, PNG, etc.)

Output Formats

  • GeoTIFF (.tif) - Recommended for geospatial data
  • JPEG (.jpg) - For visualisation and web applications
  • PNG (.png) - For high-quality images with transparency
  • And many others - Any format supported by GDAL drivers
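
Which drivers are available depends on the local GDAL build, so it can help to check a driver name before using it as `output_file_format`. The sketch below uses `gdal.GetDriverByName` from the GDAL Python bindings, which returns `None` for an unknown driver. The wrapper `driver_available` is a hypothetical helper, and the guarded import is only there so the snippet also runs where the bindings are missing.

```python
try:
    from osgeo import gdal
except ImportError:
    # GDAL Python bindings are not installed in this environment
    gdal = None


def driver_available(driver_name):
    """Return True/False if GDAL can be queried, or None if the bindings are missing."""
    if gdal is None:
        return None
    return gdal.GetDriverByName(driver_name) is not None


for name in ["GTiff", "JPEG", "PNG", "NOT_A_REAL_DRIVER"]:
    print(name, driver_available(name))
```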

Version Information

Current version: 3.3.0

Recent Updates (v3.3.0)

  • Enhanced defensive programming with nested list support
  • Modern PEP 604 type annotations throughout
  • Improved error handling and documentation
  • Variable name standardisation for consistency

For detailed version history, see CHANGELOG.md.

Error Handling

The package provides comprehensive error handling:

  • RuntimeError: For GDAL operation failures (file opening, driver issues, raster creation)
  • ValueError: For parameter validation and input consistency checks
  • TypeError: For incorrect parameter types

Example error scenarios:

from geosptools.raster_tools import merge_independent_rasters

# Every region must list the same number of files;
# a mismatch like the one below raises ValueError
raster_data = {
    "region1": ["file1.tif", "file2.tif"],
    "region2": ["file1.tif"]  # Inconsistent length
}
merge_independent_rasters(raster_data, "GTiff", "combined", "tif")
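
The same consistency rule can be checked before any file is opened. This is a minimal sketch of such a pre-flight check, not the validation code used inside the package; `check_equal_lengths` and its error message are hypothetical.

```python
def check_equal_lengths(raster_files_dict):
    """Raise ValueError if regions list different numbers of files (hypothetical helper)."""
    lengths = {region: len(files) for region, files in raster_files_dict.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Regions have different numbers of files: {lengths}")


raster_data = {
    "region1": ["file1.tif", "file2.tif"],
    "region2": ["file1.tif"],  # Inconsistent length
}

try:
    check_equal_lengths(raster_data)
    error_message = None
except ValueError as err:
    # Keep the message so the caller can log or display it
    error_message = str(err)

print(error_message)
```

Reporting the per-region counts in the message makes it easier to see which region is missing a file.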

System Requirements

  • Python: 3.8 or higher
  • GDAL: System-level installation required (>= 2.0)
  • Operating System: Linux, macOS, Windows (with proper GDAL setup)
  • Memory: Sufficient RAM for processing large raster datasets

Dependencies

Core Dependencies

  • GDAL Python bindings: Essential for all raster operations
  • NumPy: For efficient array operations (indirect dependency)

Internal Dependencies

  • pygenutils: Utility functions and data manipulation
  • paramlib: Parameter and configuration management

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Development Guidelines

  • Follow existing code structure and GDAL best practices
  • Add comprehensive docstrings with parameter descriptions
  • Include error handling for all GDAL operations
  • Test with various raster formats and coordinate systems
  • Update changelog for significant changes

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • GDAL Development Team for the foundational geospatial data processing library
  • OSGeo Community for open-source geospatial tools and standards
  • Python Geospatial Community for ecosystem development and best practices
  • Climate and Earth Science Communities for driving requirements and use cases

Contact

For any questions or suggestions, please open an issue on GitHub or contact the maintainers.

Troubleshooting

Common Issues

  1. GDAL Import Error:

    # Ensure GDAL is properly installed
    conda install -c conda-forge gdal
    # Or check system installation
    gdalinfo --version
    
  2. Coordinate Reference System Issues:

    • Verify CRS string format (e.g., "EPSG:4326")
    • Check if target CRS is supported by GDAL
  3. Memory Issues with Large Files:

    • Process files in smaller batches
    • Monitor system memory usage during operations
    • Consider using GDAL virtual file systems for very large datasets
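
Processing files in smaller batches, as suggested for memory issues, can be done with a few lines of plain Python. In this sketch `batched_files` is a hypothetical helper and the batch size of 2 is arbitrary; each batch would then be passed to `nc2raster` in turn.

```python
def batched_files(files, batch_size):
    """Yield consecutive slices of at most batch_size files (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(files), batch_size):
        yield files[start:start + batch_size]


nc_files = ["temp_2018.nc", "temp_2019.nc", "temp_2020.nc", "temp_2021.nc", "temp_2022.nc"]
batches = list(batched_files(nc_files, batch_size=2))
for batch in batches:
    # In a real workflow, each batch would be handed to nc2raster here
    print(batch)
```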

Getting Help

  • Check the CHANGELOG.md for recent updates
  • Review function docstrings for parameter details
  • Open an issue on GitHub for bugs or feature requests
