Air pollution data extraction and comparison with different sources - HealthyPlanet Project - AWI
STanalysis
A Python package for extracting, analyzing, and comparing environmental data from multiple sources.
Developed by the Data Science Support group (DSS) at AWI
For the HealthyPlanet Project – BIPS, under the DataNord initiative
Features
- Extract point values from NetCDF datasets with spatial and temporal dimensions
- Support for multiple input formats (CSV, Shapefile, GeoJSON)
- Temporal aggregation with configurable window sizes
- Efficient nearest-neighbor interpolation
- Automatic coordinate transformation when datasets use projected x/y grids
- Comprehensive error handling and input validation
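To make the nearest-neighbor feature concrete, the following is a minimal sketch of a nearest-cell lookup on a regular latitude/longitude grid, using only the standard library. It is not the package's implementation: the helper names `nearest_index` and `nearest_grid_value`, the toy grid, and its values are all assumptions made for illustration. With xarray, the same kind of lookup is commonly written with `.sel(..., method="nearest")`.

```python
from bisect import bisect_left


def nearest_index(sorted_coords: list[float], target: float) -> int:
    """Return the index of the value in an ascending list closest to target."""
    if not sorted_coords:
        raise ValueError("coordinate axis is empty")
    pos = bisect_left(sorted_coords, target)
    if pos == 0:
        return 0
    if pos == len(sorted_coords):
        return len(sorted_coords) - 1
    before, after = sorted_coords[pos - 1], sorted_coords[pos]
    # Ties go to the lower index.
    return pos - 1 if target - before <= after - target else pos


def nearest_grid_value(lats, lons, grid, lat, lon):
    """Look up grid[i][j] for the cell centre nearest to (lat, lon)."""
    i = nearest_index(lats, lat)
    j = nearest_index(lons, lon)
    return grid[i][j]


# Hypothetical 3 x 3 grid around Bremen; the values are made up.
lats = [52.5, 53.0, 53.5]
lons = [8.0, 8.5, 9.0]
grid = [
    [10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0],
    [16.0, 17.0, 18.0],
]

bremen_city = nearest_grid_value(lats, lons, grid, 53.0793, 8.8017)
print(bremen_city)  # 15.0 (row for lat 53.0, column for lon 9.0)
```

Because each axis is searched independently, this only works for rectilinear grids with ascending coordinates; curvilinear or projected grids need a different search.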
Installation
You can install STanalysis directly from PyPI:
pip install STanalysis
Or install from source for development:
git clone https://github.com/MuhammadShafeeque/dss-environment-analysis.git
cd dss-environment-analysis
uv pip install -e ".[dev]"
Quick Start
Here's a simple example of extracting temperature values at specific points:
from STanalysis import extract_point_values

# Extract values from a NetCDF file at specified points
result_df = extract_point_values(
    netcdf_path="temperature.nc",
    points_path="measurement_points.csv",
    variable="temperature",
    days_back=2,  # Average over the last 2 days
    date_col="date",
)
print(result_df[["name", "lat", "lon", "value"]])
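The effect of `days_back` can be pictured as a trailing average over daily values. The sketch below is an illustration only, not the package's code. It assumes the window runs from `end - days_back` through `end` inclusive, which the README does not specify. The function name `trailing_mean` and the daily values are hypothetical.

```python
from datetime import date, timedelta


def trailing_mean(daily: dict[date, float], end: date, days_back: int) -> float:
    """Average the daily values from end - days_back through end (inclusive)."""
    if days_back < 0:
        raise ValueError("days_back must be non-negative")
    # Assumption: the end date itself is part of the window.
    window = [end - timedelta(days=offset) for offset in range(days_back + 1)]
    values = [daily[day] for day in window if day in daily]
    if not values:
        raise ValueError(f"no data between {window[-1]} and {end}")
    return sum(values) / len(values)


# Made-up daily values for a single grid cell.
daily = {
    date(2024, 6, 5): 14.0,
    date(2024, 6, 6): 16.0,
    date(2024, 6, 7): 18.0,
    date(2024, 6, 8): 20.0,
}

mean_value = trailing_mean(daily, date(2024, 6, 8), days_back=2)
print(mean_value)  # 18.0, the mean of 16.0, 18.0 and 20.0
```

If the package excludes the end date or counts the window differently, the result for the same inputs would differ, so check the behaviour against a known case before relying on it.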
Input CSV format example:
name,lat,lon,date
"Bremen City",53.0793,8.8017,2024-06-08
"Bremen North",53.1680,8.6317,2024-06-15
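A point file in this layout can be checked before it is passed to the package. The following is a rough sketch using the standard library; the helper `read_points`, the required-column set, and the range checks are assumptions for illustration and do not describe the validation STanalysis itself performs.

```python
import csv
import io
from datetime import date

# Assumed column names, taken from the example file above.
REQUIRED_COLUMNS = {"name", "lat", "lon", "date"}


def read_points(handle) -> list[dict]:
    """Parse point rows and fail early on missing columns or bad coordinates."""
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    points = []
    for row in reader:
        lat, lon = float(row["lat"]), float(row["lon"])
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"coordinates out of range for {row['name']!r}")
        points.append(
            {
                "name": row["name"],
                "lat": lat,
                "lon": lon,
                "date": date.fromisoformat(row["date"]),  # expects YYYY-MM-DD
            }
        )
    return points


# The same two rows as in the example, read from memory instead of a file.
sample = io.StringIO(
    "name,lat,lon,date\n"
    '"Bremen City",53.0793,8.8017,2024-06-08\n'
    '"Bremen North",53.1680,8.6317,2024-06-15\n'
)
points = read_points(sample)
print(points[0]["name"], points[0]["date"])
```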
Documentation
Main Functions
extract_point_values
def extract_point_values(
    netcdf_path: str | Path,
    points_path: str | Path,
    variable: str,
    *,
    days_back: int = 7,
    date_col: Optional[str] = None,
    output_path: Optional[str | Path] = None,
) -> pd.DataFrame
Parameters:
- netcdf_path: Path to the input NetCDF file
- points_path: Path to point data (CSV, shapefile or GeoJSON)
- variable: Name of the variable to extract from the NetCDF dataset
- days_back: Number of days to average backwards from the provided date
- date_col: Optional column in the point file containing the date
- output_path: Optional path to write results (CSV or JSON)
- Spatial coordinates in the input points are assumed to be WGS84 (EPSG:4326) and will be transformed automatically if the NetCDF data uses a different CRS
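To show what transforming WGS84 coordinates into a projected x/y grid involves, here is a small sketch of the spherical Web Mercator (EPSG:3857) forward projection. Web Mercator is chosen only because its formula is short; the CRS that STanalysis actually targets depends on the NetCDF dataset, and real code would typically delegate this step to a library such as pyproj. The function name is hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, as used by Web Mercator


def wgs84_to_web_mercator(lat: float, lon: float) -> tuple[float, float]:
    """Project WGS84 degrees to Web Mercator metres (spherical formula)."""
    if not -90 < lat < 90:
        raise ValueError("latitude must lie strictly between -90 and 90")
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


# Bremen City from the example point file.
x, y = wgs84_to_web_mercator(53.0793, 8.8017)
print(f"x={x:.1f} m, y={y:.1f} m")
```

Once the points are in the same coordinate system as the grid, the nearest cell can be found on the x/y axes directly.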
Returns:
- DataFrame containing the extracted values and point metadata
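Since `output_path` accepts CSV or JSON, one plausible way to choose the writer is by file suffix. The sketch below shows that idea with the standard library only; it is an assumption about the approach, not the package's code, and `write_results` and the sample rows are hypothetical.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_results(rows: list[dict], output_path) -> Path:
    """Write rows as CSV or JSON, chosen by the file suffix."""
    path = Path(output_path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0]) if rows else [])
            writer.writeheader()
            writer.writerows(rows)
    elif suffix == ".json":
        path.write_text(json.dumps(rows, indent=2), encoding="utf-8")
    else:
        raise ValueError(f"unsupported output format: {suffix or '(none)'}")
    return path


# Made-up result rows in the shape printed in the Quick Start.
rows = [
    {"name": "Bremen City", "lat": 53.0793, "lon": 8.8017, "value": 18.0},
    {"name": "Bremen North", "lat": 53.168, "lon": 8.6317, "value": 17.2},
]

with tempfile.TemporaryDirectory() as tmp:
    csv_path = write_results(rows, Path(tmp) / "result.csv")
    print(csv_path.read_text(encoding="utf-8"))
```

With a pandas DataFrame, `to_csv` and `to_json` would replace the two writer branches.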
Development Setup
Prerequisites
Before you begin, ensure you have the following installed:
- Docker
- Visual Studio Code
- Dev Containers extension for VS Code
Using Dev Container
- Clone the Repository:
git clone https://github.com/MuhammadShafeeque/dss-environment-analysis.git
cd dss-environment-analysis
- Open in Dev Container:
- Open VS Code
- Press Ctrl+Shift+P (or Cmd+Shift+P on macOS)
- Type "Dev Containers: Open Folder in Container"
- Select the cloned repository folder
The dev container provides:
- Python 3.13+ with the uv package manager
- Pre-configured VS Code extensions
- All development dependencies
Development Tools
The development environment includes:
- pytest for testing
- mypy for type checking
- ruff for code formatting and linting
Run tests:
pytest
Run type checking:
mypy STanalysis
Format code:
ruff format .
Project Structure
dss-environment-analysis/ # Repository root
├── STanalysis/ # Main package source
│ ├── __init__.py # Package initialization
│ └── point_extraction.py # Core functionality
├── examples/ # Example scripts
│ └── point_extraction_example.py
├── tests/ # Test files
├── docs/ # Documentation
├── pyproject.toml # Project configuration
└── README.md # This file
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is open source and available under the MIT License.
Note: This repository is under active development. Features and APIs may change.