Air pollution data extraction and comparison with different sources - HealthyPlanet Project - AWI

STanalysis

A Python package for extracting, analyzing, and comparing environmental data from multiple sources.

Developed by the Data Science Support group (DSS) at AWI
For the HealthyPlanet Project – BIPS, under the DataNord initiative


Features

  • Extract point values from NetCDF datasets with spatial and temporal dimensions
  • Support for multiple input formats (CSV, Shapefile, GeoJSON)
  • Temporal aggregation with configurable window sizes
  • Efficient nearest-neighbor interpolation
  • Automatic coordinate transformation when datasets use projected x/y grids
  • Comprehensive error handling and input validation
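
To illustrate what nearest-neighbor extraction means on a regular latitude/longitude grid, here is a minimal pure-Python sketch. It is not the package's implementation: `nearest_index` and `nearest_grid_value` are hypothetical helper names, and the grid values are invented for illustration. The sketch picks the closest coordinate on each axis independently, which is the usual approach on rectilinear grids.

```python
from typing import Sequence


def nearest_index(axis: Sequence[float], target: float) -> int:
    """Return the index of the axis value closest to target."""
    return min(range(len(axis)), key=lambda i: abs(axis[i] - target))


def nearest_grid_value(
    lats: Sequence[float],
    lons: Sequence[float],
    grid: Sequence[Sequence[float]],
    lat: float,
    lon: float,
) -> float:
    """Pick the grid cell whose centre is nearest to (lat, lon).

    grid is indexed as grid[lat_index][lon_index].
    """
    return grid[nearest_index(lats, lat)][nearest_index(lons, lon)]


# Illustrative 3 x 3 grid around Bremen (values are invented).
lats = [53.0, 53.1, 53.2]
lons = [8.6, 8.7, 8.8]
grid = [
    [10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0],
    [16.0, 17.0, 18.0],
]

value = nearest_grid_value(lats, lons, grid, lat=53.0793, lon=8.8017)
print(value)  # 15.0: nearest latitude is 53.1, nearest longitude is 8.8
```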

Installation

You can install STanalysis directly from PyPI:

pip install STanalysis

Or install from source for development:

git clone https://github.com/MuhammadShafeeque/dss-environment-analysis.git
cd dss-environment-analysis
uv pip install -e ".[dev]"

Quick Start

Here's a simple example of extracting temperature values at specific points:

from STanalysis import extract_point_values

# Extract values from a NetCDF file at specified points
result_df = extract_point_values(
    netcdf_path="temperature.nc",
    points_path="measurement_points.csv",
    variable="temperature",
    days_back=2,  # Average over the last 2 days
    date_col="date"
)

print(result_df[["name", "lat", "lon", "value"]])

Input CSV format example:

name,lat,lon,date
"Bremen City",53.0793,8.8017,2024-06-08
"Bremen North",53.1680,8.6317,2024-06-15
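
A points file in this layout can be checked before running an extraction. The sketch below uses only the standard library. `load_points` is a hypothetical helper, not part of STanalysis, and the set of required columns is an assumption taken from the example above.

```python
import csv
import io
from datetime import date

REQUIRED_COLUMNS = {"name", "lat", "lon", "date"}  # assumed from the example


def load_points(text: str) -> list[dict]:
    """Parse CSV text into point records with typed fields."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    points = []
    for row in reader:
        lat, lon = float(row["lat"]), float(row["lon"])
        # WGS84 latitude/longitude must fall inside these bounds.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"coordinates out of range for {row['name']!r}")
        points.append(
            {
                "name": row["name"],
                "lat": lat,
                "lon": lon,
                "date": date.fromisoformat(row["date"]),
            }
        )
    return points


sample = '''name,lat,lon,date
"Bremen City",53.0793,8.8017,2024-06-08
"Bremen North",53.1680,8.6317,2024-06-15
'''

points = load_points(sample)
print(points[0]["name"], points[0]["date"])
```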

Documentation

Main Functions

extract_point_values

def extract_point_values(
    netcdf_path: str | Path,
    points_path: str | Path,
    variable: str,
    *,
    days_back: int = 7,
    date_col: Optional[str] = None,
    output_path: Optional[str | Path] = None,
) -> pd.DataFrame

Parameters:

  • netcdf_path: Path to the input NetCDF file
  • points_path: Path to point data (CSV, shapefile or GeoJSON)
  • variable: Name of the variable to extract from the NetCDF dataset
  • days_back: Number of days to average backwards from the provided date (default: 7)
  • date_col: Optional column in the point file containing the date
  • output_path: Optional path to write results (CSV or JSON)

Note: Spatial coordinates in the input points are assumed to be WGS84 (EPSG:4326) and are transformed automatically if the NetCDF data uses a different CRS.
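
The `days_back` averaging can be pictured as a mean over a trailing window of daily values. The sketch below is an illustration, not the package's code. `trailing_mean` is a hypothetical helper, and it assumes that the window ends on the provided date and includes it. This README does not state which convention `extract_point_values` uses. The temperatures are invented.

```python
from datetime import date, timedelta
from statistics import fmean


def trailing_mean(daily: dict[date, float], end: date, days_back: int) -> float:
    """Average the values for the days_back days ending on end (inclusive).

    Assumption: the window is [end - (days_back - 1), end]; days without
    data are skipped.
    """
    if days_back < 1:
        raise ValueError("days_back must be at least 1")
    window = [end - timedelta(days=offset) for offset in range(days_back)]
    values = [daily[day] for day in window if day in daily]
    if not values:
        raise ValueError("no data in the requested window")
    return fmean(values)


# Invented daily temperatures for illustration.
daily = {
    date(2024, 6, 6): 17.0,
    date(2024, 6, 7): 18.0,
    date(2024, 6, 8): 20.0,
}

print(trailing_mean(daily, end=date(2024, 6, 8), days_back=2))  # 19.0
```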

Returns:

  • DataFrame containing the extracted values and point metadata
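
As a rough picture of what writing results to `output_path` as CSV or JSON could look like, the sketch below chooses the format from the file suffix. This is an assumption for illustration only: `write_records` is a hypothetical helper, the README does not say how the package decides between the two formats, and the record shown is invented.

```python
import csv
import json
import tempfile
from pathlib import Path
from typing import Union


def write_records(records: list[dict], output_path: Union[str, Path]) -> Path:
    """Write records as CSV or JSON, chosen by the file suffix (assumed rule)."""
    path = Path(output_path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    elif suffix == ".csv":
        with path.open("w", newline="", encoding="utf-8") as handle:
            # Column order follows the keys of the first record.
            writer = csv.DictWriter(handle, fieldnames=list(records[0]))
            writer.writeheader()
            writer.writerows(records)
    else:
        raise ValueError(f"unsupported output format: {suffix!r}")
    return path


records = [{"name": "Bremen City", "lat": 53.0793, "lon": 8.8017, "value": 19.0}]
out_dir = Path(tempfile.mkdtemp())
csv_path = write_records(records, out_dir / "result.csv")
json_path = write_records(records, out_dir / "result.json")
print(csv_path.read_text(encoding="utf-8"))
```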

Development Setup

Prerequisites

Before you begin, ensure you have Git and VS Code with the Dev Containers extension installed. Dev containers also need a container runtime such as Docker.

Using Dev Container

  1. Clone the Repository:
git clone https://github.com/MuhammadShafeeque/dss-environment-analysis.git
cd dss-environment-analysis
  2. Open in Dev Container:
    • Open VS Code
    • Press Ctrl+Shift+P (or Cmd+Shift+P on macOS)
    • Type "Dev Containers: Open Folder in Container"
    • Select the cloned repository folder

The dev container provides:

  • Python 3.13+ with uv package manager
  • Pre-configured VS Code extensions
  • All development dependencies

Development Tools

The development environment includes:

  • pytest for testing
  • mypy for type checking
  • ruff for code formatting and linting

Run tests:

pytest

Run type checking:

mypy STanalysis

Format code:

ruff format .

Project Structure

dss-environment-analysis/          # Repository root
├── STanalysis/                   # Main package source
│   ├── __init__.py              # Package initialization
│   └── point_extraction.py       # Core functionality
├── examples/                     # Example scripts
│   └── point_extraction_example.py
├── tests/                        # Test files
├── docs/                         # Documentation
├── pyproject.toml               # Project configuration
└── README.md                    # This file

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is open source and available under the MIT License.


Note: This repository is under active development. Features and APIs may change.

Download files

Source Distribution

stanalysis-0.1.3.tar.gz (6.7 kB)

Uploaded Source

Built Distribution

stanalysis-0.1.3-py3-none-any.whl (5.8 kB)

Uploaded Python 3

File details

Details for the file stanalysis-0.1.3.tar.gz.

File metadata

  • Download URL: stanalysis-0.1.3.tar.gz
  • Upload date:
  • Size: 6.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for stanalysis-0.1.3.tar.gz
Algorithm Hash digest
SHA256 8f6e8daa9237cbbd5f43b752662f6e38527904f7efb0de443ae2ab286752c8f0
MD5 06fe9dcdb23b0d11b51afec2b767c395
BLAKE2b-256 8cdd3e8da4f6191ce96a27a485a7a8b80299b1dbf13cf2815d7d0cb7fcf236dc

File details

Details for the file stanalysis-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: stanalysis-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 5.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for stanalysis-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 76a92640f1914231bc5e12424db87bb6bd13a79872f8ae0d99f06604604c4abc
MD5 a83465b90f113cfa4edb8efd5bd8647b
BLAKE2b-256 871e30871db7c600442b77868b74c71290a69321c11f4dcec3cb0c96a98d5f92
