Air pollution data extraction and comparison with different sources - HealthyPlanet Project - AWI


STanalysis


A Python package for extracting, analyzing, and comparing environmental data from multiple sources.

Developed by the Data Science Support group (DSS) at AWI
For the HealthyPlanet Project – BIPS, under the DataNord initiative


Features

  • Extract point values from NetCDF datasets with spatial and temporal dimensions
  • Support for multiple input formats (CSV, Shapefile, GeoJSON)
  • Temporal aggregation with configurable window sizes
  • Efficient nearest-neighbor interpolation
  • Automatic coordinate transformation when datasets use projected x/y grids
  • Comprehensive error handling and input validation
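Conceptually, nearest-neighbor extraction picks the grid cell whose coordinates are closest to each query point. A minimal NumPy sketch of the idea on a hypothetical regular grid (illustrative only, not the package's actual implementation):

```python
import numpy as np

# Hypothetical 1-D coordinate axes of a regular lat/lon grid (0.1-degree spacing)
lats = np.linspace(53.0, 54.0, 11)
lons = np.linspace(8.0, 9.0, 11)

def nearest_cell(lat: float, lon: float) -> tuple[int, int]:
    """Return the (i, j) grid index closest to the point (lat, lon)."""
    i = int(np.abs(lats - lat).argmin())
    j = int(np.abs(lons - lon).argmin())
    return i, j

# Bremen City from the example CSV below
print(nearest_cell(53.0793, 8.8017))  # → (1, 8)
```

On separable 1-D axes this lookup is linear per axis; the package may use a different strategy internally, e.g. for curvilinear grids.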

Installation

You can install STanalysis directly from PyPI:

pip install STanalysis

Or install from source for development:

git clone https://github.com/MuhammadShafeeque/dss-environment-analysis.git
cd dss-environment-analysis
uv pip install -e ".[dev]"

Quick Start

Here's a simple example of extracting temperature values at specific points:

from STanalysis import extract_point_values

# Extract values from a NetCDF file at specified points
result_df = extract_point_values(
    netcdf_path="temperature.nc",
    points_path="measurement_points.csv",
    variable="temperature",
    days_back=2,  # Average over the last 2 days
    date_col="date"
)

print(result_df[["name", "lat", "lon", "value"]])

Input CSV format example:

name,lat,lon,date
"Bremen City",53.0793,8.8017,2024-06-08
"Bremen North",53.1680,8.6317,2024-06-15

Documentation

Main Functions

extract_point_values

def extract_point_values(
    netcdf_path: str | Path,
    points_path: str | Path,
    variable: str,
    *,
    days_back: int = 7,
    date_col: Optional[str] = None,
    output_path: Optional[str | Path] = None,
) -> pd.DataFrame

Parameters:

  • netcdf_path: Path to the input NetCDF file
  • points_path: Path to the point data (CSV, Shapefile, or GeoJSON)
  • variable: Name of the variable to extract from the NetCDF dataset
  • days_back: Number of days to average backwards from the provided date (default: 7)
  • date_col: Optional column in the point file containing the date
  • output_path: Optional path to write results (CSV or JSON)

Note: Spatial coordinates in the input points are assumed to be WGS84 (EPSG:4326) and are transformed automatically if the NetCDF dataset uses a different CRS.

Returns:

  • DataFrame containing the extracted values and point metadata
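The days_back parameter averages backwards from each point's date. Conceptually this amounts to selecting a trailing window and taking its mean; a hedged sketch with a hypothetical daily series for one grid cell (not the package's internal code):

```python
import pandas as pd

# Hypothetical daily values for a single grid cell
series = pd.Series(
    [10.0, 12.0, 14.0, 16.0],
    index=pd.date_range("2024-06-05", periods=4, freq="D"),
)

def backward_mean(s: pd.Series, end_date: str, days_back: int) -> float:
    """Mean of the last `days_back` days ending at `end_date` (inclusive)."""
    end = pd.Timestamp(end_date)
    start = end - pd.Timedelta(days=days_back - 1)
    return float(s.loc[start:end].mean())

print(backward_mean(series, "2024-06-08", days_back=2))  # mean of June 7-8 → 15.0
```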

Development Setup

Prerequisites

Before you begin, ensure you have the following installed:

  • Docker (required to run the dev container)
  • Visual Studio Code with the Dev Containers extension

Using Dev Container

  1. Clone the repository:

     git clone https://github.com/MuhammadShafeeque/dss-environment-analysis.git
     cd dss-environment-analysis

  2. Open in Dev Container:
    • Open VS Code
    • Press Ctrl+Shift+P (or Cmd+Shift+P on macOS)
    • Type "Dev Containers: Open Folder in Container"
    • Select the cloned repository folder

The dev container provides:

  • Python 3.13+ with uv package manager
  • Pre-configured VS Code extensions
  • All development dependencies

Development Tools

The development environment includes:

  • pytest for testing
  • mypy for type checking
  • ruff for code formatting and linting

Run tests:

pytest

Run type checking:

mypy STanalysis

Format code:

ruff format .

Project Structure

dss-environment-analysis/          # Repository root
├── STanalysis/                    # Main package source
│   ├── __init__.py                # Package initialization
│   └── point_extraction.py        # Core functionality
├── examples/                      # Example scripts
│   └── point_extraction_example.py
├── tests/                         # Test files
├── docs/                          # Documentation
├── pyproject.toml                 # Project configuration
└── README.md                      # This file

Troubleshooting

NumPy binary compatibility errors

If you see an error like:

ImportError: A module that was compiled using NumPy 1.x cannot be run in NumPy 2.x

one of the compiled dependencies (for example netCDF4, pyproj, or shapely) was built against an older NumPy version. Reinstall the affected packages after upgrading NumPy; forcing a source build compiles them against the NumPy version you have installed:

pip install --force-reinstall --no-binary :all: netCDF4 pyproj shapely

If rebuilding fails or compatible versions are not available, downgrade NumPy to a 1.x release:

pip install "numpy<2"

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is open source and available under the MIT License.


Note: This repository is under active development. Features and APIs may change.

Project details


Download files

Download the file for your platform.

Source Distribution

stanalysis-0.1.4.tar.gz (7.3 kB)


Built Distribution


stanalysis-0.1.4-py3-none-any.whl (6.5 kB)


File details

Details for the file stanalysis-0.1.4.tar.gz.

File metadata

  • Download URL: stanalysis-0.1.4.tar.gz
  • Size: 7.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for stanalysis-0.1.4.tar.gz:

  • SHA256: e23b91c2f3f1d3951744e43c0da7d0019ad576b3eb2c91c5663774efb599085d
  • MD5: 4da745658340ee23930a41a89a2fad81
  • BLAKE2b-256: f6fd84ed41eecf601e99ea1017d0e315e0d0c84f81994a89c81d5cb22cb477ab


File details

Details for the file stanalysis-0.1.4-py3-none-any.whl.

File metadata

  • Download URL: stanalysis-0.1.4-py3-none-any.whl
  • Size: 6.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for stanalysis-0.1.4-py3-none-any.whl:

  • SHA256: 33c86c025827ee11cbdf680b58bf1311428dcffa669412cd94921f50b40e60bf
  • MD5: dc9713c1e633c51fd213dcf232d8ced1
  • BLAKE2b-256: bbeed2108f96c5cf79495acd8a549a2460323cf5b5f995e27268c08959a8fc40

