Air pollution data extraction and comparison with different sources - HealthyPlanet Project - AWI

STanalysis

A Python package for extracting, analyzing, and comparing environmental data from multiple sources.

Developed by the Data Science Support group (DSS) at AWI
For the HealthyPlanet Project – BIPS, under the DataNord initiative


Features

  • Extract point values from NetCDF datasets with spatial and temporal dimensions
  • Support for multiple input formats (CSV, Shapefile, GeoJSON)
  • Temporal aggregation with configurable window sizes
  • Efficient nearest-neighbor interpolation
  • Automatic coordinate transformation when datasets use projected x/y grids
  • Comprehensive error handling and input validation
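To illustrate the nearest-neighbor lookup listed above, here is a minimal sketch using only the standard library. It is not the STanalysis implementation: the grid, the coordinate lists, and the helper names `nearest_index` and `nearest_value` are hypothetical, and the sketch assumes a regular grid whose latitude and longitude axes are sorted in ascending order.

```python
import bisect


def nearest_index(sorted_coords, target):
    """Return the index of the value in an ascending list closest to target."""
    pos = bisect.bisect_left(sorted_coords, target)
    if pos == 0:
        return 0
    if pos == len(sorted_coords):
        return len(sorted_coords) - 1
    before, after = sorted_coords[pos - 1], sorted_coords[pos]
    # Pick whichever neighbour is closer; ties go to the lower index.
    return pos - 1 if target - before <= after - target else pos


# Hypothetical 3 x 4 grid: rows follow lats, columns follow lons.
lats = [52.0, 53.0, 54.0]
lons = [7.0, 8.0, 9.0, 10.0]
grid = [
    [10.0, 11.0, 12.0, 13.0],
    [20.0, 21.0, 22.0, 23.0],
    [30.0, 31.0, 32.0, 33.0],
]


def nearest_value(lat, lon):
    """Look up the grid cell whose centre is nearest to (lat, lon)."""
    return grid[nearest_index(lats, lat)][nearest_index(lons, lon)]


# "Bremen City" from the Quick Start CSV falls in row 1 (53.0), column 2 (9.0).
value = nearest_value(53.0793, 8.8017)
print(value)  # 22.0
```

Points outside the grid are clamped to the nearest edge cell in this sketch; a real workflow may prefer to reject them instead.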

Installation

You can install STanalysis directly from PyPI:

pip install STanalysis

Or install from source for development:

git clone https://github.com/MuhammadShafeeque/dss-environment-analysis.git
cd dss-environment-analysis
uv pip install -e ".[dev]"

Quick Start

Here's a simple example of extracting temperature values at specific points:

from STanalysis import extract_point_values

# Extract values from a NetCDF file at specified points
result_df = extract_point_values(
    netcdf_path="temperature.nc",
    points_path="measurement_points.csv",
    variable="temperature",
    days_back=2,  # Average over the last 2 days
    date_col="date"
)

print(result_df[["name", "lat", "lon", "value"]])

Input CSV format example:

name,lat,lon,date
"Bremen City",53.0793,8.8017,2024-06-08
"Bremen North",53.1680,8.6317,2024-06-15
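Before passing a points file to the package, it can help to check that it matches the layout above. The following is a small sketch using only the standard library; `read_points` and `REQUIRED_COLUMNS` are hypothetical names, not part of STanalysis, and the sketch treats the `date` column as required even though `date_col` is optional in the package.

```python
import csv
import io
from datetime import date

# Columns taken from the example CSV above (an assumption of this sketch).
REQUIRED_COLUMNS = {"name", "lat", "lon", "date"}

sample = '''name,lat,lon,date
"Bremen City",53.0793,8.8017,2024-06-08
"Bremen North",53.1680,8.6317,2024-06-15
'''


def read_points(handle):
    """Parse a points CSV and validate columns, coordinates, and dates."""
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    points = []
    for row in reader:
        lat, lon = float(row["lat"]), float(row["lon"])
        # WGS84 latitude/longitude bounds.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"coordinates out of range for {row['name']!r}")
        points.append(
            {
                "name": row["name"],
                "lat": lat,
                "lon": lon,
                "date": date.fromisoformat(row["date"]),  # expects YYYY-MM-DD
            }
        )
    return points


points = read_points(io.StringIO(sample))
print([p["name"] for p in points])
```

For a file on disk, replace `io.StringIO(sample)` with `open("measurement_points.csv", newline="")`.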

Documentation

Main Functions

extract_point_values

def extract_point_values(
    netcdf_path: str | Path,
    points_path: str | Path,
    variable: str,
    *,
    days_back: int = 7,
    date_col: Optional[str] = None,
    output_path: Optional[str | Path] = None,
) -> pd.DataFrame

Parameters:

  • netcdf_path: Path to the input NetCDF file
  • points_path: Path to point data (CSV, shapefile or GeoJSON)
  • variable: Name of the variable to extract from the NetCDF dataset
  • days_back: Number of days to average backwards from the provided date (default: 7)
  • date_col: Optional column in the point file containing the date
  • output_path: Optional path to write results (CSV or JSON)

Note: Spatial coordinates in the input points are assumed to be WGS84 (EPSG:4326). They are transformed automatically if the NetCDF data uses a different CRS.

Returns:

  • DataFrame containing the extracted values and point metadata
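The `days_back` averaging described in the parameters can be pictured with a short standard-library sketch. This is only an illustration, not the package's code: `backward_mean` is a hypothetical helper, the values are made up, and the sketch assumes the window includes the provided date itself and skips days without data. The package may define the window differently.

```python
from datetime import date, timedelta
from statistics import fmean


def backward_mean(daily_values, end_date, days_back):
    """Mean over the days_back days ending on end_date (inclusive, by assumption)."""
    window = [end_date - timedelta(days=offset) for offset in range(days_back)]
    # Keep only the days for which a value is available.
    available = [daily_values[day] for day in window if day in daily_values]
    if not available:
        raise ValueError("no data in the requested window")
    return fmean(available)


# Made-up daily values for a single point.
daily = {
    date(2024, 6, 6): 14.0,
    date(2024, 6, 7): 16.0,
    date(2024, 6, 8): 18.0,
}

mean_two_days = backward_mean(daily, date(2024, 6, 8), days_back=2)
print(mean_two_days)  # 17.0, the mean of 18.0 and 16.0
```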

Development Setup

Prerequisites

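Before you begin, ensure you have the tools used in the steps below installed: Git, VS Code with the Dev Containers extension, and a container runtime such as Docker.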

Using Dev Container

  1. Clone the Repository:
git clone https://github.com/MuhammadShafeeque/dss-environment-analysis.git
cd dss-environment-analysis
  2. Open in Dev Container:
    • Open VS Code
    • Press Ctrl+Shift+P (or Cmd+Shift+P on macOS)
    • Type "Dev Containers: Open Folder in Container"
    • Select the cloned repository folder

The dev container provides:

  • Python 3.13+ with uv package manager
  • Pre-configured VS Code extensions
  • All development dependencies

Development Tools

The development environment includes:

  • pytest for testing
  • mypy for type checking
  • ruff for code formatting and linting

Run tests:

pytest

Run type checking:

mypy STanalysis

Format code:

ruff format .

Project Structure

dss-environment-analysis/          # Repository root
├── STanalysis/                   # Main package source
│   ├── __init__.py              # Package initialization
│   └── point_extraction.py       # Core functionality
├── examples/                     # Example scripts
│   └── point_extraction_example.py
├── tests/                        # Test files
├── docs/                         # Documentation
├── pyproject.toml               # Project configuration
└── README.md                    # This file

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is open source and available under the MIT License.


Note: This repository is under active development. Features and APIs may change.
