Weather forecast analysis with ensemble probability and population risk assessment
PIPECAST Weather Forecast Analysis
Pipeline Integrated Prediction & Environmental Climate Analysis using Satellite Tracking
A Python library for weather forecast analysis, Areas of Interest (AOI) generation, ensemble probability products, and population risk assessment.
Features
Multi-Dataset Support
- HRRR (Continental US)
- HRRR Alaska
- ECMWF (planned)
- GFS (planned)
- Extensible to other datasets
AOI Generation
- Automatic identification of precipitation areas
- Multiple threshold levels
- Connected region detection
- Land clipping
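As an illustration of the thresholding and connected-region steps listed above, here is a minimal pure-Python sketch. It is not PIPECAST's implementation: the function name `find_regions`, the toy grid, and the choice of 4-connectivity are all assumptions made for this example.

```python
from collections import deque


def find_regions(grid, threshold):
    """Group cells with value >= threshold into 4-connected regions.

    Hypothetical helper for illustration; returns a list of regions,
    each a list of (row, col) tuples.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] < threshold:
                continue
            # Breadth-first flood fill from this unvisited cell.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and grid[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


# Toy precipitation grid in mm (made-up values).
precip_mm = [
    [0, 45, 50, 0, 0],
    [0, 41, 0, 0, 120],
    [0, 0, 0, 0, 110],
]

for threshold in (39, 100):
    sizes = [len(region) for region in find_regions(precip_mm, threshold)]
    print(threshold, sizes)
```

Running the same grid at several thresholds mirrors the "multiple threshold levels" idea: higher thresholds yield fewer, smaller regions.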
Enhanced Analysis
- Census population data integration
- Watershed analysis
- Custom layer support
- Risk ranking
Ensemble Products
- Probabilistic forecast generation
- Multi-member aggregation
- GeoTIFF probability maps
- Ranked risk assessment
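One common way to turn several forecast members into a probability map is to count, per grid cell, the fraction of members that meet a threshold. The sketch below shows that idea on nested lists. How PIPECAST actually defines members and aggregates them is not described here, so `exceedance_probability` and the sample data are assumptions for illustration only.

```python
def exceedance_probability(members, threshold):
    """Per-cell fraction of members whose value is >= threshold.

    `members` is a list of equally shaped 2D grids (lists of lists).
    Hypothetical helper, not part of the PIPECAST API.
    """
    n = len(members)
    rows, cols = len(members[0]), len(members[0][0])
    return [
        [sum(m[r][c] >= threshold for m in members) / n for c in range(cols)]
        for r in range(rows)
    ]


# Four made-up members on a 2x2 grid (precipitation in mm).
members = [
    [[10, 60], [0, 45]],
    [[50, 70], [0, 20]],
    [[5, 40], [0, 41]],
    [[45, 39], [0, 0]],
]

prob = exceedance_probability(members, threshold=39)
print(prob)
```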
Visualization
- Grid plots of AOIs
- Interactive Folium maps
- Threshold comparisons
- Time series analysis
Installation
From PyPI (Recommended)
pip install pipecast-weather
From Source (Development)
git clone https://github.com/NASA-EarthRISE/PIPECAST.git
cd PIPECAST
pip install -e .
With Visualization Support
pip install pipecast-weather[viz]
Google Colab
!pip install git+https://github.com/NASA-EarthRISE/PIPECAST.git
Quick Start
Basic Usage
from pipecast import ForecastConfig, ForecastProcessor
# Configure
config = ForecastConfig(
    forecast_dates=["2025-10-07", "2025-10-08", "2025-10-09"],
    fxx_list=[0, 12, 24],
    thresholds=[39, 100, 255],
    weather_dataset="hrrr",
    output_dir="./output"
)
# Process
processor = ForecastProcessor(config)
results = processor.process_all_forecasts()
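To get a rough sense of the workload implied by a configuration like the one above, you can count the date, forecast-hour, and threshold combinations. This sketch assumes the processor runs once per combination, which this README does not confirm; it uses only the standard library and does not call PIPECAST.

```python
from itertools import product

forecast_dates = ["2025-10-07", "2025-10-08", "2025-10-09"]
fxx_list = [0, 12, 24]
thresholds = [39, 100, 255]

# Assumption: one forecast run per (date, fxx, threshold) combination.
runs = list(product(forecast_dates, fxx_list, thresholds))

print(len(runs))
print(runs[0])
```

Shortening any of the three lists shrinks the count multiplicatively, which is why trimming dates or thresholds is an effective way to cut run time.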
Alaska HRRR
from pipecast import ForecastProcessor
from pipecast.config import PresetConfigs
# Use preset configuration
config = PresetConfigs.alaska_hrrr(
    forecast_dates=["2025-10-07", "2025-10-08"],
    output_dir="./alaska_output"
)
processor = ForecastProcessor(config)
results = processor.process_all_forecasts()
With Enhanced Layers (Census + Watershed)
config = ForecastConfig(
    forecast_dates=["2025-10-07"],
    forecast_methods=["standard", "enhanced"],
    use_census=True,
    use_watershed=True,
    output_dir="./enhanced_output"
)
processor = ForecastProcessor(config)
results = processor.process_all_forecasts()
Creating Ensemble Products
from pipecast import EnsembleProcessor
# Create ensemble probability maps
processor = EnsembleProcessor("./output")
prob_paths = processor.create_ensemble_probabilities()
# Rank AOIs by risk
ranked = processor.rank_aois_by_probability(
    census_gdf=census_data,  # optional; census data you have already loaded
    top_n=50
)
print(ranked.head(10))
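The ranking criteria used by `rank_aois_by_probability` are not spelled out in this README. As a hedged illustration of what a "top N by risk" step can look like, the sketch below sorts plain dictionaries by how many ensemble members agree and then by exposed population. The sort key, the `rank_aois` helper, and the sample records are assumptions; only the column names `ensemble_count` and `population_affected` come from the workflow example in this document.

```python
def rank_aois(aois, top_n):
    """Return the top_n AOI records, highest risk first.

    Assumed ordering: more agreeing members first, then larger population.
    Hypothetical helper, not part of the PIPECAST API.
    """
    ordered = sorted(
        aois,
        key=lambda a: (a["ensemble_count"], a["population_affected"]),
        reverse=True,
    )
    return ordered[:top_n]


# Made-up AOI records.
aois = [
    {"id": "A", "ensemble_count": 3, "population_affected": 1200},
    {"id": "B", "ensemble_count": 7, "population_affected": 300},
    {"id": "C", "ensemble_count": 7, "population_affected": 9000},
    {"id": "D", "ensemble_count": 1, "population_affected": 50000},
]

top = rank_aois(aois, top_n=3)
print([a["id"] for a in top])
```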
Visualization
from pipecast.visualization import visualize_forecast_outputs
# Create all standard visualizations
visualize_forecast_outputs("./output")
# Or use the visualizer class for more control
from pipecast.visualization import ForecastVisualizer
viz = ForecastVisualizer("./output")
viz.plot_threshold_comparison("2025-10-07", fxx=12)
viz.create_all_date_maps()
Configuration Options
Complete Configuration Example
from pipecast import ForecastConfig, WeatherDataset
config = ForecastConfig(
    # Dates and times
    forecast_dates=["2025-10-07", "2025-10-08", "2025-10-09", "2025-10-10"],
    fxx_list=[0, 4, 8, 12, 16, 20, 24],
    # Weather dataset
    weather_dataset=WeatherDataset.HRRR,
    product="sfc",
    variable="APCP:surface",
    variable_name="tp",
    # Thresholds
    thresholds=[5, 39, 50, 100, 254, 255],
    # Processing
    forecast_methods=["standard", "enhanced"],
    target_crs="EPSG:4326",
    min_aoi_area=0.01,
    clip_to_land=True,
    # Enhanced layers
    use_census=True,
    use_watershed=True,
    custom_layers={
        "roads": "/path/to/roads.shp",
        "infrastructure": "/path/to/infrastructure.geojson"
    },
    # Ensemble
    threshold_bins=[(0, 5), (6, 39), (40, 50), (51, 100), (100, 254), (255, float('inf'))],
    bin_labels=["0-5", "6-39", "40-50", "51-100", "100-254", "255+"],
    ensemble_resolution=0.05,
    # Output
    output_dir="./output",
    save_aois=True,
    save_ensemble=True,
    save_visualizations=True
)
Custom Threshold Bins
# Define custom bins for your warning levels
config = ForecastConfig(
    forecast_dates=["2025-10-07"],
    threshold_bins=[
        (0, 10),              # Light
        (10, 25),             # Moderate
        (25, 50),             # Heavy
        (50, 100),            # Very Heavy
        (100, float('inf'))   # Extreme
    ],
    bin_labels=["Light", "Moderate", "Heavy", "Very Heavy", "Extreme"],
    output_dir="./custom_bins"
)
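The bins above share edges (10 ends one bin and starts the next), so it helps to be explicit about which bin a boundary value belongs to. The sketch below assumes half-open intervals, `low <= value < high`; whether PIPECAST treats edges this way is not stated here, and `label_for` is a hypothetical helper.

```python
def label_for(value_mm, bins, labels):
    """Return the label of the first bin with low <= value_mm < high.

    Returns None when the value falls outside every bin.
    Assumes half-open intervals; hypothetical helper for illustration.
    """
    if len(bins) != len(labels):
        raise ValueError("bins and labels must have the same length")
    for (low, high), label in zip(bins, labels):
        if low <= value_mm < high:
            return label
    return None


bins = [(0, 10), (10, 25), (25, 50), (50, 100), (100, float("inf"))]
labels = ["Light", "Moderate", "Heavy", "Very Heavy", "Extreme"]

for value in (9.9, 10, 250, -1):
    print(value, label_for(value, bins, labels))
```

A check like this is a quick way to confirm that `threshold_bins` and `bin_labels` line up before starting a long run.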
Working with Custom Layers
# Add your own enhanced layers
config = ForecastConfig(
    forecast_dates=["2025-10-07"],
    forecast_methods=["enhanced"],
    use_census=True,
    use_watershed=True,
    custom_layers={
        "pipelines": "/path/to/pipeline_network.shp",
        "critical_infrastructure": "/path/to/infrastructure.geojson",
        "evacuation_zones": "/path/to/zones.shp"
    },
    output_dir="./custom_layers"
)
processor = ForecastProcessor(config)
results = processor.process_all_forecasts()
# Results will include stats for each custom layer
for date in results['enhanced']:
    for key, stats in results['enhanced'][date].items():
        print(f"{key}: {stats.get('pipelines_features', 0)} pipeline features affected")
AOI Filtering by Region
# Use a shapefile to constrain analysis to a specific region
config = ForecastConfig(
    forecast_dates=["2025-10-07"],
    aoi_shapefile="/path/to/alabama.shp",  # Only analyze Alabama
    output_dir="./alabama_only"
)
Output Structure
output/
├── standard/
│   ├── 2025-10-07/
│   │   ├── F0_T39_aois.geojson
│   │   ├── F0_T100_aois.geojson
│   │   ├── F12_T39_aois.geojson
│   │   └── ...
│   ├── 2025-10-08/
│   │   └── ...
│   └── ...
├── enhanced/
│   ├── 2025-10-07/
│   │   └── ...
│   └── ...
├── ensemble_probability/
│   ├── probability_0-5.tif
│   ├── probability_6-39.tif
│   ├── probability_40-50.tif
│   ├── probability_51-100.tif
│   ├── probability_100-254.tif
│   ├── probability_255plus.tif
│   ├── ranked_aois.csv
│   └── ensemble_manifest.json
├── visualizations/
│   ├── aoi_grid_batch_1.png
│   ├── map_2025-10-07.html
│   └── ...
└── experiment_summary.json
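The AOI file names in the layout above appear to encode the forecast hour and threshold (for example, `F12_T39_aois.geojson`). Assuming that pattern holds for every AOI file, a small parser such as the following can recover the method, date, forecast hour, and threshold from a path. The helper `parse_aoi_path` and the regular expression are illustrative assumptions, not part of PIPECAST.

```python
import re
from pathlib import PurePosixPath

# Assumed naming pattern: F<fxx>_T<threshold>_aois.geojson
AOI_NAME = re.compile(r"^F(?P<fxx>\d+)_T(?P<threshold>\d+)_aois\.geojson$")


def parse_aoi_path(path):
    """Split '<root>/<method>/<date>/F<fxx>_T<threshold>_aois.geojson'.

    Returns a dict, or None when the file name does not match the pattern.
    """
    p = PurePosixPath(path)
    match = AOI_NAME.match(p.name)
    if match is None:
        return None
    return {
        "method": p.parent.parent.name,
        "date": p.parent.name,
        "fxx": int(match["fxx"]),
        "threshold": int(match["threshold"]),
    }


info = parse_aoi_path("output/standard/2025-10-07/F12_T39_aois.geojson")
print(info)
print(parse_aoi_path("output/ensemble_probability/ranked_aois.csv"))
```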
Google Colab Example
# Mount Drive
from google.colab import drive
drive.mount('/content/drive')
# Install
!pip install git+https://github.com/NASA-EarthRISE/PIPECAST.git
!pip install herbie-data --quiet
!pip install rasterio
# Run
from pipecast import ForecastConfig, ForecastProcessor
config = ForecastConfig(
    forecast_dates=["2025-10-07", "2025-10-08"],
    fxx_list=[0, 12, 24],
    thresholds=[39, 100, 255],
    weather_dataset="hrrr",
    use_census=True,
    use_watershed=True,
    output_dir="/content/drive/MyDrive/pipecast_output"
)
processor = ForecastProcessor(config)
results = processor.process_all_forecasts()
# Create ensemble products
from pipecast import EnsembleProcessor
ensemble = EnsembleProcessor("/content/drive/MyDrive/pipecast_output")
ensemble.create_ensemble_probabilities()
# Rank by risk
ranked = ensemble.rank_aois_by_probability(
    census_gdf=processor.census_gdf,
    top_n=50
)
API Reference
ForecastConfig
Configuration class for forecast processing.
Key Parameters:
- forecast_dates: List of dates (YYYY-MM-DD)
- fxx_list: Forecast hours to evaluate
- thresholds: Precipitation thresholds in mm
- weather_dataset: Dataset to use (HRRR, HRRRAK, etc.)
- forecast_methods: ["standard", "enhanced"]
- use_census: Include census population data
- use_watershed: Include watershed data
- custom_layers: Dict of custom layers
- clip_to_land: Remove ocean areas
- output_dir: Output directory
ForecastProcessor
Main processing engine.
Methods:
- process_all_forecasts(): Process all configured forecasts
- process_single_forecast(date, fxx, threshold, method): Process one forecast
- generate_aois(precip_data, ds, threshold): Generate AOIs from data
- enhance_aois(gdf_aoi): Add census/watershed statistics
EnsembleProcessor
Ensemble probability generator.
Methods:
- create_ensemble_probabilities(): Create probability GeoTIFFs
- rank_aois_by_probability(census_gdf, top_n): Rank AOIs by risk
- collect_members(): Gather all AOI files
DataManager
Enhanced layer management.
Methods:
- download_census_data(url): Download/load census
- download_watershed_data(url): Download/load watershed
- load_custom_layer(name, filepath): Load custom layer
- clip_to_land(gdf, boundary): Clip to land areas
Preset Configurations
from pipecast.config import PresetConfigs
# Alaska
config = PresetConfigs.alaska_hrrr(dates, output_dir)
# Continental US
config = PresetConfigs.conus_hrrr(dates, output_dir)
# Quick test
config = PresetConfigs.quick_test(date, output_dir)
Workflow Example: Complete Pipeline
from pipecast import ForecastConfig, ForecastProcessor, EnsembleProcessor
from pipecast.visualization import visualize_forecast_outputs
# Step 1: Configure
config = ForecastConfig(
    forecast_dates=["2025-10-07", "2025-10-08", "2025-10-09", "2025-10-10"],
    fxx_list=[0, 4, 8, 12, 16, 20, 24],
    thresholds=[5, 39, 50, 100, 254, 255],
    forecast_methods=["standard", "enhanced"],
    use_census=True,
    use_watershed=True,
    clip_to_land=True,
    output_dir="./complete_run"
)
# Step 2: Process forecasts
print("Processing forecasts...")
processor = ForecastProcessor(config)
results = processor.process_all_forecasts()
# Step 3: Create ensemble products
print("\nCreating ensemble products...")
ensemble = EnsembleProcessor("./complete_run")
prob_paths = ensemble.create_ensemble_probabilities()
# Step 4: Rank by risk
print("\nRanking AOIs by risk...")
ranked = ensemble.rank_aois_by_probability(
    census_gdf=processor.census_gdf,
    top_n=100
)
# Step 5: Visualize
print("\nCreating visualizations...")
visualize_forecast_outputs("./complete_run")
print("\nComplete pipeline finished!")
print("Results in: ./complete_run")
print("Top 10 highest-risk AOIs:")
print(ranked[['bin', 'ensemble_count', 'mean_precip_mm', 'population_affected']].head(10))
Troubleshooting
Issue: Herbie can't find data
- Check that your date falls within the period for which HRRR data is available
- Verify internet connection
- Try different fxx values
Issue: Out of memory
- Process fewer dates at once
- Reduce number of thresholds
- Use a coarser ensemble grid (a larger ensemble_resolution value, i.e. bigger cells in degrees)
Issue: Census/watershed download fails
- Check Zenodo URLs
- Verify network connection
- Use local files instead
Issue: No AOIs generated
- Check that the thresholds are appropriate for the data
- Verify weather data was fetched correctly
- Try lower threshold values
Performance Tips
- Parallel Processing: Process dates in parallel (future feature)
- Caching: Enhanced layers are cached locally
- Grid Resolution: Increase resolution_deg (larger cells, coarser grid) for faster ensemble processing
- Selective Processing: Use specific date/threshold lists
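To see why a larger cell size speeds up the ensemble step, consider how the number of grid cells scales with resolution. The sketch below is plain arithmetic on a hypothetical 10° by 5° bounding box; `grid_shape` is an illustrative helper and does not reflect how PIPECAST builds its grid.

```python
import math


def grid_shape(min_lon, min_lat, max_lon, max_lat, resolution_deg):
    """Return (rows, cols) for a regular lat/lon grid covering the box.

    Rounding before ceil guards against floating-point noise such as
    199.99999999999997 turning into an extra cell.
    """
    cols = math.ceil(round((max_lon - min_lon) / resolution_deg, 9))
    rows = math.ceil(round((max_lat - min_lat) / resolution_deg, 9))
    return rows, cols


# Hypothetical bounding box: 10 degrees wide, 5 degrees tall.
for res in (0.05, 0.1):
    rows, cols = grid_shape(0, 0, 10, 5, res)
    print(res, rows, cols, rows * cols)
```

Doubling the cell size halves both dimensions, so the cell count (and roughly the memory per probability map) drops by a factor of four.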
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests
- Submit a pull request
Citation
If you use PIPECAST in your research:
[Citation information to be added]
License
This project is licensed under the MIT License - see the LICENSE file for details.
TL;DR: Use it, modify it, share it - just keep the copyright notice!
Contact
- GitHub Issues: Report issues
- Documentation: Full docs (coming soon)
Acknowledgments
- NASA-EarthRISE initiative
- NOAA HRRR dataset
- Herbie weather data library