Interactive CLI toolkit for processing climate data with NetCDF to Zarr conversion and county-level analysis

๐ŸŒก๏ธ Climate Zarr Toolkit

An interactive CLI toolkit for processing climate data, with guided wizards, smart prompts, and Rich-formatted output. It converts NetCDF files to Zarr and computes county-level climate statistics.

🚀 Main Features

  • 🗜️ NetCDF → Zarr Conversion: Convert multiple NetCDF files to optimized Zarr format with compression
  • 📈 County Statistics: Calculate detailed climate statistics by county/region with parallel processing
  • 🗺️ Regional Clipping: Built-in support for US regions (CONUS, Alaska, Hawaii, etc.)
  • 🌡️ Multiple Variables: Support for precipitation, temperature, and extreme weather analysis
  • ⚡ Modern Performance: Leverages Dask, parallel processing, and modern data formats
  • 🎨 Beautiful CLI: Rich-powered interface with progress bars and beautiful output

✨ Interactive Features

  • 🧙‍♂️ Interactive Wizard: Complete guided experience for beginners and experts
  • 🎯 Smart Prompts: Intelligent parameter suggestions with beautiful selection menus
  • ✅ Safety Confirmations: Prevent accidental data loss with confirmation dialogs
  • 📂 Smart File Detection: Automatically discovers and suggests data sources
  • 🗺️ Visual Region Selection: Choose regions with descriptions and coverage details
  • 🔬 Variable Picker: Climate variable selection with tooltips and explanations
  • ⚡ Performance Tuning: Interactive optimization suggestions for your workflow

🎮 Interactive vs Command-Line Modes

This toolkit offers three ways to work with climate data:

🧙‍♂️ Wizard Mode - Best for Beginners

Complete guided experience with step-by-step instructions:

# Launch the interactive wizard
python climate_cli.py wizard

# The wizard will guide you through:
# 1. ✨ Choose your workflow (convert, analyze, or both)
# 2. 📁 Smart file/directory selection
# 3. 🗺️ Regional clipping with visual descriptions
# 4. 🔬 Climate variable selection with explanations
# 5. ⚙️ Performance optimization suggestions
# 6. ✅ Safety confirmations before processing
# 7. 📊 Beautiful results summary

🎯 Interactive Mode - Best for Daily Use

Individual commands with intelligent prompting:

# Interactive NetCDF → Zarr conversion
python climate_cli.py create-zarr
# Prompts: Select files → Output name → Region? → Compression?

# Interactive county statistics
python climate_cli.py county-stats
# Prompts: Zarr path → Region → Variable → Threshold → Output file

⚡ Command-Line Mode - Best for Automation

Traditional CLI for scripts and automation:

# Non-interactive mode (disable prompts)
python climate_cli.py create-zarr data/ -o output.zarr --region conus --interactive false
python climate_cli.py county-stats data.zarr conus -v pr -t 25.4 --interactive false

📦 Data Requirements

Important: This toolkit requires you to provide your own data files. The repository does not include large data files to keep it lightweight and fast to clone.

Required Data Files

  1. 🌡️ Climate Data: NetCDF files with climate variables (precipitation, temperature, etc.)

  2. 🗺️ US County Boundaries: Census TIGER/Line county shapefiles

Quick Data Setup

# 1. Place your NetCDF files in the data directory
mkdir -p data/
# Copy your .nc files to data/

# 2. Download and prepare county shapefiles (see utils/README.md for details)
cd utils/
# Follow instructions in utils/README.md to download and split shapefiles
python split_counties_by_region.py

# 3. You're ready to go!
cd ..
python climate_cli.py wizard

🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/mihiarc/climate-zarr
cd climate-zarr

# Install dependencies (using uv - the modern Python package manager)
uv sync

# Or install in editable mode
uv pip install -e .

Get Started in 30 Seconds

# 🧙‍♂️ NEW! Start with the interactive wizard (recommended)
python climate_cli.py wizard

# Or explore individual commands
python climate_cli.py --help          # See all commands
python climate_cli.py info            # Check your data
python climate_cli.py list-regions    # See available regions

# 🎮 Try interactive mode
python climate_cli.py create-zarr     # Interactive conversion
python climate_cli.py county-stats    # Interactive analysis

📖 Commands Reference

🧙‍♂️ Wizard Mode - Complete Guided Experience

python climate_cli.py wizard
# or
python climate_cli.py interactive

Perfect for:

  • 🎓 Learning the toolkit
  • 🔄 Complete end-to-end workflows
  • Complex multi-step analyses
  • 🧠 Understanding best practices

🗜️ Create Zarr from NetCDF

Interactive Mode (Recommended):

python climate_cli.py create-zarr
# The CLI will guide you through:
# - 📂 File/directory selection
# - 📝 Output naming
# - 🗺️ Regional clipping options
# - 🗜️ Compression settings

Command-Line Mode:

# Basic conversion
python climate_cli.py create-zarr data/ -o precipitation.zarr

# Convert with region clipping (CONUS only)
python climate_cli.py create-zarr data/ -o conus_precip.zarr --region conus

# Custom chunking and compression
python climate_cli.py create-zarr data/ \
    -o optimized.zarr \
    --chunks "time=365,lat=180,lon=360" \
    --compression zstd \
    --compression-level 7

# Non-interactive mode for scripts
python climate_cli.py create-zarr data/ -o output.zarr --interactive false

Options:

  • --output, -o: Output Zarr store path
  • --region, -r: Clip to specific region (conus, alaska, hawaii, etc.)
  • --concat-dim, -d: Dimension to concatenate along (default: time)
  • --chunks, -c: Custom chunk sizes
  • --compression: Algorithm (default, zstd, zlib, gzip)
  • --compression-level: Level 1-9 (default: 5)
  • --interactive, -i: Enable/disable interactive prompts (default: true)
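The --chunks value shown above is a comma-separated list of dim=size pairs. As a rough illustration of how such a string could be turned into a mapping, here is a minimal sketch; parse_chunks is a hypothetical helper written for this example and is not taken from the toolkit's source.

```python
def parse_chunks(spec: str) -> dict[str, int]:
    """Parse a string such as 'time=365,lat=180,lon=360' into a dict.

    Hypothetical helper: the toolkit's own parsing may differ.
    """
    chunks: dict[str, int] = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas
        name, sep, size = item.partition("=")
        name = name.strip()
        if not sep or not name:
            raise ValueError(f"expected 'dim=size', got {item!r}")
        value = int(size)  # raises ValueError on non-integer sizes
        if value <= 0:
            raise ValueError(f"chunk size must be positive, got {value}")
        chunks[name] = value
    return chunks


print(parse_chunks("time=365,lat=180,lon=360"))
# {'time': 365, 'lat': 180, 'lon': 360}
```

A mapping in this shape is what xarray's chunking arguments accept, so a dim=size string is a convenient command-line encoding for it.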

📈 Calculate County Statistics

Interactive Mode (Recommended):

python climate_cli.py county-stats
# The CLI will guide you through:
# - 📁 Zarr dataset selection
# - 🗺️ Region selection with descriptions
# - 🔬 Climate variable picker with tooltips
# - 🎯 Threshold configuration
# - ⚡ Performance settings

Command-Line Mode:

# Basic precipitation analysis for CONUS
python climate_cli.py county-stats precipitation.zarr conus -v pr -t 25.4

# Temperature analysis for Alaska with more workers
python climate_cli.py county-stats temperature.zarr alaska \
    -v tas \
    --workers 8 \
    -o alaska_temp_stats.csv

# Extreme heat analysis for Hawaii (threshold in °C, matching the tasmax units)
python climate_cli.py county-stats extremes.zarr hawaii \
    -v tasmax \
    -t 32 \
    --scenario future

# Using distributed processing
python climate_cli.py county-stats large_dataset.zarr conus \
    -v pr \
    --distributed \
    --workers 16

# Non-interactive mode for scripts
python climate_cli.py county-stats data.zarr conus -v pr -t 25.4 --interactive false

Options:

  • --output, -o: Output CSV file
  • --variable, -v: Climate variable (pr, tas, tasmax, tasmin)
  • --scenario, -s: Scenario name (default: historical)
  • --threshold, -t: Threshold value for analysis
  • --workers, -w: Number of worker processes
  • --distributed: Use Dask distributed processing
  • --chunk-counties: Process counties in chunks (default: True)
  • --interactive, -i: Enable/disable interactive prompts (default: true)

๐Ÿ—บ๏ธ Available Regions

The toolkit supports these predefined regions:

Region Name Coverage
conus Continental US 24.0ยฐN to 50.0ยฐN, -125.0ยฐE to -66.0ยฐE
alaska Alaska 54.0ยฐN to 72.0ยฐN, -180.0ยฐE to -129.0ยฐE
hawaii Hawaii 18.0ยฐN to 23.0ยฐN, -162.0ยฐE to -154.0ยฐE
guam Guam/MP 13.0ยฐN to 21.0ยฐN, 144.0ยฐE to 146.0ยฐE
puerto_rico Puerto Rico/USVI 17.5ยฐN to 18.6ยฐN, -67.5ยฐE to -64.5ยฐE
global Global Full global coverage
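To make the bounding boxes concrete, the following sketch encodes the listed regions as (lat_min, lat_max, lon_min, lon_max) tuples and checks which box contains a point. REGION_BOUNDS and regions_containing are names invented for this example; the toolkit's internal representation is not shown in this README and may differ. The global region is left out because it has no numeric bounds listed.

```python
# Bounds copied from the regions table: (lat_min, lat_max, lon_min, lon_max),
# with longitudes in the -180..180 convention used there.
REGION_BOUNDS = {
    "conus": (24.0, 50.0, -125.0, -66.0),
    "alaska": (54.0, 72.0, -180.0, -129.0),
    "hawaii": (18.0, 23.0, -162.0, -154.0),
    "guam": (13.0, 21.0, 144.0, 146.0),
    "puerto_rico": (17.5, 18.6, -67.5, -64.5),
}


def regions_containing(lat: float, lon: float) -> list[str]:
    """Return the names of all regions whose bounding box contains the point."""
    return [
        name
        for name, (lat_min, lat_max, lon_min, lon_max) in REGION_BOUNDS.items()
        if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
    ]


print(regions_containing(39.74, -104.99))  # Denver -> ['conus']
print(regions_containing(21.31, -157.86))  # Honolulu -> ['hawaii']
```

Note that these are rectangular boxes, so a point inside the CONUS box is not necessarily inside a US county; the county shapefiles do the precise clipping.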

🔬 Supported Climate Variables

Variable | Description               | Units  | Statistics Generated
pr       | Precipitation             | mm/day | Total annual, days >25.4mm, mean daily, max daily, dry days
tas      | Air Temperature           | °C     | Mean annual, min/max, range, std dev, freezing days, hot days
tasmax   | Daily Maximum Temperature | °C     | Mean annual max, extremes, hot days above threshold
tasmin   | Daily Minimum Temperature | °C     | Mean annual min, cold days, frost-free period
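As an illustration of what the precipitation statistics mean, here is a minimal pure-Python sketch that computes them for one series of daily values in mm/day. The function name, the output keys, and the 1.0 mm dry-day cutoff are assumptions made for this example; only the 25.4 mm threshold (one inch) comes from the table.

```python
def precipitation_stats(daily_mm, threshold_mm=25.4, dry_day_mm=1.0):
    """Summarize one series of daily precipitation totals (mm/day).

    Hypothetical helper. dry_day_mm is an assumed cutoff, not a documented
    toolkit default.
    """
    if not daily_mm:
        raise ValueError("need at least one daily value")
    total = sum(daily_mm)
    return {
        "total_mm": total,
        "days_above_threshold": sum(1 for v in daily_mm if v > threshold_mm),
        "mean_daily_mm": total / len(daily_mm),
        "max_daily_mm": max(daily_mm),
        "dry_days": sum(1 for v in daily_mm if v < dry_day_mm),
    }


sample = [0.0, 0.5, 3.0, 30.0, 26.5, 0.0]
print(precipitation_stats(sample))
# {'total_mm': 60.0, 'days_above_threshold': 2, 'mean_daily_mm': 10.0,
#  'max_daily_mm': 30.0, 'dry_days': 3}
```

In the toolkit these statistics are produced per county and per year; the sketch only covers the arithmetic for a single series.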

๐Ÿ“ Project Structure

climate-zarr/
โ”œโ”€โ”€ climate_cli.py              # ๐ŸŽฏ Interactive CLI tool (NEW!)
โ”œโ”€โ”€ stack_nc_to_zarr.py         # NetCDF โ†’ Zarr conversion
โ”œโ”€โ”€ calculate_county_stats.py   # County statistics processor
โ”œโ”€โ”€ climate_config.py           # Configuration management
โ”œโ”€โ”€ demo_cli.py                 # Interactive demo script
โ”œโ”€โ”€ utils/
โ”‚   โ”œโ”€โ”€ split_counties_by_region.py  # County shapefile splitter
โ”‚   โ””โ”€โ”€ README.md               # Data preparation instructions
โ”œโ”€โ”€ data/                       # ๐Ÿ“ NetCDF input files (user-provided)
โ”œโ”€โ”€ regional_counties/          # ๐Ÿ—บ๏ธ County shapefiles by region (user-generated)
โ””โ”€โ”€ pyproject.toml             # Project dependencies

Note: data/ and regional_counties/ directories are not included in the repository. Users must:

  1. Add their own NetCDF climate data to data/
  2. Follow utils/README.md to download and prepare county shapefiles

🎯 Usage Examples

🧙‍♂️ Complete Workflow with Wizard (Recommended for beginners)

# Start the interactive wizard
python climate_cli.py wizard

# Follow the guided prompts:
# 1. "What would you like to do?" → Full pipeline
# 2. "Select your data source" → Choose data/ directory
# 3. "Configure Zarr conversion" → CONUS region, ZSTD compression
# 4. "Configure county statistics" → Precipitation, 25.4mm threshold
# 5. Review settings and confirm
# 6. Watch beautiful progress bars and get comprehensive results!

🎯 Interactive Command Workflow (Daily use)

# 1. Check available data (always good to start here)
python climate_cli.py info

# 2. Interactive NetCDF → Zarr conversion
python climate_cli.py create-zarr
# Follow prompts: data/ → precipitation.zarr → CONUS → ZSTD

# 3. Interactive county statistics
python climate_cli.py county-stats
# Follow prompts: precipitation.zarr → CONUS → pr → 25.4 → results.csv

⚡ Command-Line Workflow (Automation & scripts)

# 1. Check available data and regions
python climate_cli.py info
python climate_cli.py list-regions

# 2. Convert NetCDF to Zarr for CONUS region
python climate_cli.py create-zarr data/ \
    -o conus_precipitation.zarr \
    --region conus \
    --compression zstd \
    --interactive false

# 3. Calculate county precipitation statistics
python climate_cli.py county-stats conus_precipitation.zarr conus \
    -v pr \
    -t 25.4 \
    -o conus_precip_stats.csv \
    --workers 8 \
    --interactive false

# 4. Analyze temperature extremes for different regions
python climate_cli.py county-stats temperature.zarr alaska \
    -v tasmin \
    -o alaska_cold_stats.csv \
    --interactive false

python climate_cli.py county-stats temperature.zarr hawaii \
    -v tasmax \
    -t 32 \
    -o hawaii_heat_stats.csv \
    --interactive false
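For batch runs, the non-interactive commands above can also be assembled from a script. The sketch below builds the argument lists for the two temperature analyses and prints them; it only uses flags that appear in this README. The helper names and the dry_run switch are made up for this example, and nothing is executed unless dry_run is set to False.

```python
import shlex
import subprocess

CLI = ["python", "climate_cli.py"]  # entry point as invoked in this README


def county_stats_cmd(zarr_path, region, variable, output, threshold=None, workers=None):
    """Build the argv list for one non-interactive county-stats run."""
    cmd = CLI + ["county-stats", zarr_path, region, "-v", variable, "-o", output]
    if threshold is not None:
        cmd += ["-t", str(threshold)]
    if workers is not None:
        cmd += ["--workers", str(workers)]
    return cmd + ["--interactive", "false"]


def run_all(commands, dry_run=True):
    """Print each command; run it too when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop on the first failure


JOBS = [
    county_stats_cmd("temperature.zarr", "alaska", "tasmin", "alaska_cold_stats.csv"),
    county_stats_cmd("temperature.zarr", "hawaii", "tasmax", "hawaii_heat_stats.csv", threshold=32),
]
run_all(JOBS, dry_run=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.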

🎮 Mixed Interactive & Command-Line

# Use interactive mode for complex decisions, CLI for known parameters
python climate_cli.py create-zarr          # Interactive file/region selection
python climate_cli.py county-stats data.zarr conus -v pr  # Known dataset, interactive for other params

🛠️ Technical Details

Modern Interactive Stack (2025)

  • CLI Framework: Typer with Rich integration for beautiful output
  • Interactive Prompts: Questionary for beautiful selection menus and confirmations
  • Data Processing: xarray, dask, zarr (v3 ready)
  • Geospatial: geopandas, rioxarray, pyogrio
  • Performance: Parallel processing, chunked operations
  • Visualization: Rich progress bars, tables, panels

Interactive Features

  • Smart File Detection: Automatically scans common directories (data/, input/, netcdf/)
  • Contextual Suggestions: Intelligent defaults based on your data and previous choices
  • Error Recovery: When something goes wrong, get interactive suggestions to fix it
  • Safety First: Confirmation dialogs before potentially long-running or destructive operations
  • Progress Tracking: Beautiful progress bars and real-time status updates
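The file detection described above can be pictured as a scan of a few conventional directories for .nc files. This is a minimal sketch of that idea using pathlib; the directory names come from the list above, while find_netcdf_files and its return shape are assumptions for illustration rather than the toolkit's actual code.

```python
from pathlib import Path

# Directory names listed under Smart File Detection.
CANDIDATE_DIRS = ("data", "input", "netcdf")


def find_netcdf_files(root="."):
    """Map each candidate directory that holds .nc files to its sorted file list."""
    found = {}
    for name in CANDIDATE_DIRS:
        directory = Path(root) / name
        if not directory.is_dir():
            continue
        files = sorted(directory.glob("*.nc"))
        if files:  # skip directories without NetCDF files
            found[name] = files
    return found


for name, files in find_netcdf_files().items():
    print(f"{name}/: {len(files)} NetCDF file(s)")
```

The scan is deliberately shallow (no recursion), which keeps it fast on large data directories.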

Performance Features

  • Intelligent Chunking: Automatically optimized for your data
  • Parallel Processing: Multiprocessing + optional Dask distributed
  • Memory Efficient: Chunked county processing for large datasets
  • Compression: Multiple algorithms (zstd, zlib, gzip) with tunable levels
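Chunked county processing amounts to splitting the county list into batches and handing each batch to a worker, so that only a few counties are in memory at once. The sketch below shows that pattern with the standard library. It uses a thread pool to stay short and self-contained, whereas the toolkit is described as using multiprocessing and optional Dask; summarize_batch is a stand-in for the real per-county computation, and the county IDs are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from statistics import mean


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def summarize_batch(batch):
    """Stand-in for the real work: mean of each county's values."""
    return {county_id: mean(values) for county_id, values in batch}


def process_counties(county_values, batch_size=2, workers=4):
    """Process counties batch by batch and merge the partial results."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        batches = batched(county_values.items(), batch_size)
        for partial in pool.map(summarize_batch, batches):
            results.update(partial)
    return results


sample = {"01001": [1.0, 3.0], "01003": [2.0, 4.0], "01005": [10.0]}
print(process_counties(sample))
# {'01001': 2.0, '01003': 3.0, '01005': 10.0}
```

Smaller batches lower peak memory at the cost of more scheduling overhead, which is the trade-off the --chunk-counties and --workers options expose.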

Data Formats

  • Input: NetCDF (.nc) files with CF conventions
  • Output: Zarr v2/v3 stores, CSV statistics
  • Coordinates: Automatic handling of different longitude conventions
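Climate datasets commonly store longitude either as 0 to 360 or as -180 to 180, while the region bounds in this README use negative values for the western hemisphere. A minimal sketch of the conversion between the two conventions follows; the function names are made up for this example and the toolkit's own handling is not shown here.

```python
def to_signed_longitude(lon: float) -> float:
    """Map any longitude to the [-180, 180) convention."""
    return ((lon + 180.0) % 360.0) - 180.0


def to_positive_longitude(lon: float) -> float:
    """Map any longitude to the [0, 360) convention."""
    return lon % 360.0


print(to_signed_longitude(235.0))     # -125.0 (western edge of the CONUS box)
print(to_signed_longitude(-66.0))     # -66.0 (already signed, unchanged)
print(to_positive_longitude(-125.0))  # 235.0
```

When the same arithmetic is applied to a gridded dataset's longitude coordinate, the coordinate usually needs to be re-sorted afterwards so that it stays monotonic.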

🎓 Interactive Learning Mode

For New Users:

  1. Start with: python climate_cli.py wizard
  2. Learn basics: Follow the guided tour and explanations
  3. Practice: Try interactive commands with python climate_cli.py create-zarr
  4. Advanced: Move to command-line mode for automation

For Experienced Users:

  1. Quick setup: python climate_cli.py create-zarr data/ -o output.zarr --region conus
  2. Interactive help: Use prompts when you need parameter suggestions
  3. Automation: Use --interactive false for scripts and CI/CD

For Developers:

  1. Study the code: Modern CLI patterns with Typer + Rich + Questionary
  2. Extend features: Add new interactive prompts and wizard steps
  3. Learn patterns: Type hints, async processing, configuration management

🎮 Demo & Testing

# 🎬 Run the comprehensive interactive demo
python demo_cli.py

# 🧪 Test individual features
python climate_cli.py wizard           # Full wizard experience
python climate_cli.py create-zarr      # Interactive conversion
python climate_cli.py county-stats     # Interactive analysis
python climate_cli.py info             # System overview
python climate_cli.py --help           # See all commands

🤝 Contributing

This is a modern, educational toolkit showcasing 2025 best practices in interactive CLI development. Key patterns demonstrated:

  • 🎨 Modern CLI Design: Typer + Rich + Questionary for beautiful UX
  • 🎯 Interactive UX Patterns: Beautiful prompts, confirmations, selections
  • ⚡ Performance Optimization: Chunking, compression, parallel processing
  • 🎯 CLI Design Excellence: User-friendly command interfaces with rich feedback
  • 🔄 Data Engineering: Efficient conversion and processing pipelines
  • 🌟 Open Source Integration: Leveraging the latest ecosystem tools

🎬 Start your journey: python climate_cli.py wizard


Built with modern 2025 tools: Python 3.10+, Typer, Rich, Questionary, xarray, Zarr, Dask, and more!
