Tools for processing cubed-sphere data.

m21ctools

m21ctools is a Python library designed to handle cubed-sphere data efficiently. It provides tools for reading, processing, interpolating, and visualizing data from cubed-sphere NetCDF-4 files.

Key Features

Data Loading and Cleaning

  • Reading NetCDF Files:
    Reads NetCDF-4 files with the xarray library and the h5netcdf engine.
  • Handling Duplicate Dimensions:
    Automatically resolves issues with duplicate 'ncontact' dimension names by replacing them with unique names, ensuring the dataset is ready for analysis.
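
As a rough illustration of the renaming step, the sketch below makes repeated dimension names unique by appending a numeric suffix. The function name dedupe_dim_names, the suffix scheme, and the example dimension tuple are assumptions for illustration; how m21ctools performs the replacement internally is not shown here.

```python
def dedupe_dim_names(dims):
    """Return dims with repeated names made unique via numeric suffixes."""
    seen = {}
    unique = []
    for name in dims:
        count = seen.get(name, 0)
        seen[name] = count + 1
        # Keep the first occurrence as-is; suffix every later repeat.
        unique.append(name if count == 0 else f"{name}_{count}")
    return tuple(unique)


# Hypothetical variable whose last two dimensions share a name.
print(dedupe_dim_names(("nf", "ncontact", "ncontact")))
# -> ('nf', 'ncontact', 'ncontact_1')
```

This simple scheme does not guard against a suffixed name colliding with a dimension that already exists, which a production version would need to check.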

Longitude Adjustment

  • Standardizing Coordinates:
    Automatically adjusts longitudes to fall within the standard range of -180° to 180°.
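
One common way to express this adjustment is a modulo shift. The sketch below is an assumption about the arithmetic, not the library's confirmed implementation; wrap_longitude is a hypothetical helper name.

```python
def wrap_longitude(lon):
    """Map a longitude in degrees onto the half-open interval [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


print([wrap_longitude(x) for x in (0.0, 190.0, 270.0, 360.0)])
# -> [0.0, -170.0, -90.0, 0.0]
```

Under this convention an input of exactly 180° maps to -180°. The same expression also works elementwise on NumPy arrays.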

Data Aggregation

  • Combining Data Faces:
    Aggregates data from the six faces of the cubed-sphere into flat lists, which simplifies further analysis and processing.
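
The idea of collapsing the six faces can be sketched with plain nested lists. The helper flatten_faces and the tiny 2x2 faces are illustrative stand-ins, not part of the m21ctools API.

```python
from itertools import chain


def flatten_faces(faces):
    """Flatten a sequence of 2D faces (lists of rows) into one flat list."""
    return list(chain.from_iterable(chain.from_iterable(face) for face in faces))


# Six tiny 2x2 faces standing in for real cubed-sphere tiles.
faces = [[[f * 10 + 0, f * 10 + 1], [f * 10 + 2, f * 10 + 3]] for f in range(6)]
flat = flatten_faces(faces)
print(len(flat))   # -> 24
print(flat[:5])    # -> [0, 1, 2, 3, 10]
```

Applying the same flattening to the latitude, longitude, and data arrays keeps the three lists aligned index by index.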

Interpolation to Regular Grid

  • Grid Interpolation:
    Interpolates irregular cubed-sphere data onto a regular latitude-longitude grid using interpolation methods from SciPy.
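
A minimal sketch of scattered-to-regular interpolation with scipy.interpolate.griddata is shown below. This page does not say which SciPy routine m21ctools calls internally, so griddata is an assumption here, as are the synthetic points and the 1-degree target grid.

```python
import numpy as np
from scipy.interpolate import griddata

# Synthetic scattered samples standing in for flattened cubed-sphere points.
# The four corner points guarantee the target grid lies inside the convex hull.
rng = np.random.default_rng(0)
lons = np.concatenate([rng.uniform(-10, 10, 200), [-10, -10, 10, 10]])
lats = np.concatenate([rng.uniform(-10, 10, 200), [-10, 10, -10, 10]])
values = 2.0 * lons + 3.0 * lats  # linear field, so linear interpolation is exact

# Regular target grid with 1-degree spacing inside the sampled region.
lon_grid, lat_grid = np.meshgrid(np.arange(-5.0, 6.0, 1.0), np.arange(-5.0, 6.0, 1.0))

data_grid = griddata(
    points=np.column_stack([lons, lats]),
    values=values,
    xi=(lon_grid, lat_grid),
    method="linear",
)
print(data_grid.shape)  # -> (11, 11)
```

With method="linear", target points outside the convex hull of the samples come back as NaN. Treating longitude and latitude as planar coordinates also ignores the seam at ±180°, which matters for global fields.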

Visualization

  • Plotting Tools:
    Visualizes the interpolated data with contour plots using Matplotlib and Cartopy, complete with coastlines and axis labels.

Ensemble Spread Analysis

  • Reading from Archives:
    Reads ensemble spread data directly from .tar archives containing .nc4 files.
  • Parallel Processing:
    Processes multiple input files in parallel using Python's multiprocessing module.
  • Data Version Control (icechunk):
    Uses icechunk to version and store the processed (e.g., averaged) time-series data, with history tracked through commits.
  • Time Series Analysis:
    Tracks the evolution of ensemble spread with Hovmöller diagrams for both 2D and 3D variables.
  • Incremental Updates & Overwriting:
    • Skips timestamps already present in the icechunk repository (as reported by get_existing_times), so repeated runs only process new data.
    • The force_rerun flag of the parallel_process_files_2d3d function bypasses this skipping and forces reprocessing for a specific date range (e.g., if the input data changed).
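
The skip-or-reprocess decision described above can be sketched as a small filter over candidate timestamps. The helper select_times and the 6-hour step are assumptions for illustration; they are not the signature or behavior of parallel_process_files_2d3d itself.

```python
from datetime import datetime, timedelta


def select_times(start, end, step, existing_times, force_rerun=False):
    """List timestamps in [start, end] that still need processing."""
    existing = set(existing_times)
    selected = []
    current = start
    while current <= end:
        # Reprocess everything when forced; otherwise skip stored timestamps.
        if force_rerun or current not in existing:
            selected.append(current)
        current += step
    return selected


already_done = [datetime(2010, 1, 4, 0), datetime(2010, 1, 4, 6)]
todo = select_times(
    datetime(2010, 1, 4, 0),
    datetime(2010, 1, 4, 18),
    timedelta(hours=6),
    already_done,
)
print(todo)  # the 12:00 and 18:00 timestamps remain
```

Passing force_rerun=True returns all four timestamps in the range, which mirrors the overwrite use case where input data has changed.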

Usage Example

Basic Cubed-Sphere Data Processing

from m21ctools.data_handler import CubedSphereData

# Initialize the CubedSphereData object with your NetCDF file path, time and level (indices), variable name, and grid resolution value.
data_handler = CubedSphereData(
    file_path="path/to/your/datafile.nc4",
    time=0,
    lev=0,
    variable="QV",
    resolution=1.0
)

# Access raw and cleaned data.
raw_data = data_handler.raw_data
clean_data = data_handler.raw_data_cleaned

# Retrieve aggregated latitudes, longitudes, and data as flat 1D arrays.
all_lats, all_lons, all_data = data_handler.all_lats, data_handler.all_lons, data_handler.all_data

# Interpolate data to a uniform latitude-longitude grid.
lat_grid, lon_grid, data_grid = data_handler.interpolate_to_latlon_grid(method='linear')  # Default interpolation method is 'linear'

# Visualize the data.
data_handler.plot_data(lat_grid, lon_grid, data_grid)

Ensemble Spread Analysis

from datetime import datetime
import icechunk
import m21ctools.data_handler as m21c_handler
import m21ctools.config as cfg

# Define time range and variables
start_date = datetime(2010, 1, 4, 0)  # Example date range
end_date = datetime(2010, 1, 5, 0)
force_reprocess = False  # Set to True to force overwrite for this range

# Use variables from config or define manually
var3d_list = cfg.DEFAULT_VAR3D  # Example: ['u']
var2d_list = cfg.DEFAULT_VAR2D  # Example: ['ps']

# Initialize icechunk repository
icechunk_repo_path = "ensemble_store"
storage = icechunk.local_filesystem_storage(icechunk_repo_path)
icechunk_repo = icechunk.Repository.open_or_create(storage)

# Get already processed timestamps
times_to_skip = m21c_handler.get_existing_times(icechunk_repo, 'u')

# Process new data with parallel processing
combined_averages = m21c_handler.parallel_process_files_2d3d(
    start_date, end_date,
    var3d_list, var2d_list,
    skip_times=times_to_skip,
    num_workers=4,
    force_rerun=force_reprocess
)

# Save to icechunk repository
m21c_handler.save_to_icechunk(icechunk_repo, combined_averages, "Processed new data")

# Load and visualize
u_data = m21c_handler.load_from_icechunk(icechunk_repo, 'u')
ps_data = m21c_handler.load_from_icechunk(icechunk_repo, 'ps')

# Create Hovmöller diagrams
m21c_handler.plot_hovmoeller_3d(u_data, var='u')  # For 3D variables
m21c_handler.plot_hovmoeller_2d(ps_data, var='ps')  # For 2D variables

For more examples, check the examples/ directory in the repository.

Download files

Source Distribution

m21ctools-0.2.0.tar.gz (278.3 kB)

Uploaded Source

Built Distribution

m21ctools-0.2.0-py2.py3-none-any.whl (13.4 kB)

Uploaded Python 2, Python 3

File details

Details for the file m21ctools-0.2.0.tar.gz.

File metadata

  • Download URL: m21ctools-0.2.0.tar.gz
  • Size: 278.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for m21ctools-0.2.0.tar.gz
Algorithm    Hash digest
SHA256       16beaf86f5d2387199c712c7f5e04bdc448fafe4af6f5f8c0899ea29af6ce348
MD5          db597cd32f295a501fe6376c466421f2
BLAKE2b-256  5bdd1cbb09998eeb8f932236a7c9e240d68f3d1e59b6ca62fcd691bbca5b7937

Provenance

The following attestation bundles were made for m21ctools-0.2.0.tar.gz:

Publisher: python-publish.yml on ftgoktas/m21ctools

File details

Details for the file m21ctools-0.2.0-py2.py3-none-any.whl.

File metadata

  • Download URL: m21ctools-0.2.0-py2.py3-none-any.whl
  • Size: 13.4 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for m21ctools-0.2.0-py2.py3-none-any.whl
Algorithm    Hash digest
SHA256       93a5782191a630a53b91e41191e4b79c5ebfd6662b600a1b8b2ab63851a25167
MD5          796449bbd424650fb2489637f85e030a
BLAKE2b-256  734832863f49a5fb02819c2855d8f8c4b2f0f9f381505aa5b9dcb1160f3e74d9

Provenance

The following attestation bundles were made for m21ctools-0.2.0-py2.py3-none-any.whl:

Publisher: python-publish.yml on ftgoktas/m21ctools
