
MHW Detector

Marine heatwaves detector based on https://github.com/ecjoliver/marineHeatWaves.

This package integrates a Numba-optimised version of ecjoliver's implementation of MHW detection with multiprocessing capabilities, so that detection can be computed over every coordinate of the dataset.

This code is not limited to detecting MHWs: it can also detect extreme events in any variable, such as chlorophyll-a, pH, O2, etc.

Installation

pip install mhw-detect

Dependencies

  • xarray
  • numba
  • scipy
  • dask
  • numpy
  • pandas
  • netcdf4
  • click
  • pyarrow

Usage

Configuration file

With mhw-detect there is no need for kilometres of parameters on the command line. You simply write a configuration file gathering every parameter, like an identity card of your next detection.

data:
  data :
    path : '/folder/sst.nc'
    var : 'sst'
  clim :
    path : '/folder_clim/clim.nc'
    var : 'sst'
  percent :
    path : '/folder_percent/percentile.nc'
    var : 'sst'
  # Optional
  offset :
    path : '/folder_offset/offset.nc'
    var : 'sst'

params:
    depth : 0
    climatologyPeriod : [null, null] # ex: [1983, 2012]
    pctile : 90
    windowHalfWidth : 5
    smoothPercentile : True
    smoothPercentileWidth : 31
    minDuration : 5
    joinAcrossGaps : True
    maxGap : 2
    maxPadLength: False
    coldSpells : False
    Ly : False

cut:
  nb_lat : 157
  nb_lon : 72

output_detection : '/my/path/to/folder_result/'

  • data : specifies the paths and variables to use. Do not specify clim and percent if you want them to be computed during the detection. offset is an optional 2D dataset used to add an offset to the percentile.
  • params : specifies the parameters of the detection. See section below.
  • cut : specifies the number of latitude and longitude for geospatial dataset cutting.
  • output_detection : specifies the folder in which to save the results.
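Before launching a run, the configuration can be loaded and sanity-checked with PyYAML (a sketch: key names follow the example above, the paths are placeholders):

```python
import yaml  # assumes PyYAML is installed

config = yaml.safe_load("""
data:
  data:
    path: '/folder/sst.nc'
    var: 'sst'
params:
  pctile: 90
  minDuration: 5
cut:
  nb_lat: 157
  nb_lon: 72
output_detection: '/my/path/to/folder_result/'
""")

# quick sanity checks before calling mhw-cut / mhw-detect
assert {'data', 'params', 'cut', 'output_detection'} <= config.keys()
assert config['params']['pctile'] == 90
```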

Detection parameters

From https://github.com/ecjoliver/marineHeatWaves.

climatologyPeriod      Period over which climatology is calculated, specified
                        as list of start and end years. Default ([null, null]) is to calculate
                        over the full range of years in the supplied time series.
                        Alternate periods supplied as a list, e.g. [1983, 2012].
                        Unused if precalculated clim and percentile are set.

pctile                 Threshold percentile (%) for detection of extreme values
                        (DEFAULT = 90)

windowHalfWidth        Width of window (one sided) about day-of-year used for
                        the pooling of values and calculation of threshold percentile
                        (DEFAULT = 5 [days])

smoothPercentile       Boolean switch indicating whether to smooth the threshold
                        percentile timeseries with a moving average (DEFAULT = True)

smoothPercentileWidth  Width of moving average window for smoothing threshold
                        (DEFAULT = 31 [days])

minDuration            Minimum duration for acceptance of detected MHWs
                        (DEFAULT = 5 [days])

joinAcrossGaps         Boolean switch indicating whether to join MHWs
                        which occur before/after a short gap (DEFAULT = True)

maxGap                 Maximum length of gap allowed for the joining of MHWs
                        (DEFAULT = 2 [days])

maxPadLength           Specifies the maximum length [days] over which to interpolate
                        (pad) missing data (specified as nans) in input temp time series.
                        i.e., any consecutive blocks of NaNs with length greater
                        than maxPadLength will be left as NaN. Set as an integer.
                        (DEFAULT = False, interpolates over all missing values).

coldSpells             Specifies if the code should detect cold events instead of
                        heat events. (DEFAULT = False)
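To illustrate how pctile, windowHalfWidth and smoothPercentileWidth interact, here is a minimal NumPy sketch of the day-of-year threshold computation. This is not the package's actual (Numba-optimised) implementation, just the idea behind it:

```python
import numpy as np

def daily_threshold(temp, doy, pctile=90, window_half_width=5, smooth_width=31):
    """Day-of-year percentile threshold, after Hobday et al. (2016):
    for each day-of-year, pool all values within +/- window_half_width
    days (across all years), take the pctile percentile, then smooth
    the resulting annual cycle with a moving average."""
    n_doy = 365
    thresh = np.empty(n_doy)
    for d in range(1, n_doy + 1):
        # circular window of days-of-year centred on day d
        offsets = np.arange(-window_half_width, window_half_width + 1)
        days = (d - 1 + offsets) % n_doy + 1
        thresh[d - 1] = np.percentile(temp[np.isin(doy, days)], pctile)
    # periodic moving-average smoothing (wrap around the annual cycle)
    kernel = np.ones(smooth_width) / smooth_width
    padded = np.concatenate([thresh[-smooth_width:], thresh, thresh[:smooth_width]])
    return np.convolve(padded, kernel, mode="same")[smooth_width:-smooth_width]
```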

Datasets coordinates

All datasets must have lat/latitude, lon/longitude and time as coordinates. A depth coordinate is allowed for the main dataset. Currently, the depth has to be specified via its index in the coordinate array; selecting directly by depth value will be added later.

The percentile dataset (and the offset, if used) must have a quantile coordinate as a dimension of the variable. This is useful when you want to run the detection with different quantiles (e.g. 90, 99).
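As a sketch, a percentile dataset with the expected layout can be built with xarray (a hypothetical tiny grid; real files come from your own data):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical tiny grid for illustration only.
time = pd.date_range("2003-01-01", periods=366)
lat = [10.0, 10.5]
lon = [120.0, 120.5]

# Percentile dataset: the variable carries an extra 'quantile'
# dimension so one file can hold several thresholds (e.g. 90th, 99th).
pct = xr.Dataset(
    {"sst": (("quantile", "time", "lat", "lon"),
             np.zeros((2, 366, 2, 2)))},
    coords={"quantile": [0.9, 0.99], "time": time,
            "lat": lat, "lon": lon},
)

pct["sst"].sel(quantile=0.9)  # threshold field for the 90th percentile
```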

Step 1 : Geospatial cutting (optional but recommended)

To use multiprocessing efficiently, the datasets must be cut into several smaller datasets over the lat/lon dimensions. Call mhw-cut with your config file to do so. Each sub-dataset is named Cut_X.nc, where X is the number of the cut (which is why your datasets (data, clim, percentile) must be in different folders).

The number of cuts does not matter; chunk size does. To find suitable nb_lat and nb_lon, it is best to experiment in a notebook:

import xarray as xr
ds = xr.open_dataset('dataset.nc', chunks={'latitude': nb_lat, 'longitude': nb_lon})
ds

nb_lat and nb_lon should divide the latitude and longitude dimensions evenly, and be chosen carefully so that chunks are larger than 10 MB (see the Dask documentation for more details). Printing ds in a notebook shows the size of each chunk (cut).
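A quick back-of-the-envelope estimate of the chunk size, assuming a float64 variable and, purely for illustration, a 365-step time axis:

```python
# bytes per chunk = nb_lat * nb_lon * n_time * bytes per element
nb_lat, nb_lon = 157, 72   # values from the example config above
n_time = 365               # hypothetical length of the time axis
chunk_mb = nb_lat * nb_lon * n_time * 8 / 1e6  # float64 = 8 bytes
print(f"{chunk_mb:.1f} MB per chunk")  # → 33.0 MB per chunk
```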

Please note that this step will double the disk space used.

Step 2 : Detection

Call mhw-detect to detect MHWs. With multiprocessing, each cut is processed in parallel. For each cut, you get a text file with the detection results in output_detection. Finally, when all detections are done, the text files are concatenated into one csv (with ; as separator).

If you do not want to use multiprocessing and only need a detection on a small geospatial subset, pass the option -g lat_min lat_max lon_min lon_max to the command.

Commands

Geospatial cut

Usage: mhw-cut [OPTIONS]

  Cut datasets in multiple files

Options:
  -c, --config PATH  Specify configuration file  [required]
  --help             Show this message and exit.

Detection

Usage: mhw-detect [OPTIONS]

  Detect extreme events

Options:
  -c, --config PATH               Specify configuration file  [required]

  -g, --geographical-subset <FLOAT RANGE FLOAT RANGE FLOAT RANGE FLOAT RANGE>...
                                  The geographical subset as minimal latitude,
                                  maximal latitude, minimal longitude and
                                  maximal longitude.

                                  If set, the detection will be done on the
                                  subsetted global dataset given in config
                                  file (not the cuts) and sequentially.

  --categ-map TEXT                Generate category map in a netcdf file.

  --output-df TEXT                Give a name to the output dataframe. Two
                                  extensions are available: csv and parquet
                                  (default).     Save in csv if you want to
                                  open the dataframe with excel. Parquet is
                                  more efficient and takes less disk space.
                                  [default: data.parquet]

  --help                          Show this message and exit.

Output

Here is an example of the output csv file. Every detected MHW is listed.

lat lon time_deb time_end time_peak duration duration_mod duration_str duration_sev duration_ext categ imax imean ivar rate_onset rate_decline
0 -76.916664 -180.0 2003-01-01 2003-01-18 2003-01-04 18 5.0 2.0 5.0 6.0 4.0 2.341543 1.415663 0.551971 0.1867 0.2049
1 -76.916664 -180.0 2003-04-10 2003-10-23 2003-04-13 197 0.0 0.0 0.0 0.0 4.0 4.2300e-8 -4.8800e-10 7.0377e-9 1.5032e-8 2.7109e-10
2 -76.916664 -180.0 2003-12-18 2003-12-23 2003-12-21 6 0.0 1.0 2.0 3.0 4.0 2.325211 1.858383 0.367969 0.482987 0.613132
3 -76.916664 -179.9166 2003-01-01 2003-01-18 2003-01-04 18 5.0 2.0 5.0 6.0 4.0 2.327172 1.420817 0.544248 0.182604 0.203315
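The result file loads straight into pandas. A sketch using a synthetic two-row stand-in (with a subset of the columns) for the ';'-separated csv:

```python
import io
import pandas as pd

# Synthetic stand-in for the concatenated result file; a real run
# produces the full set of columns shown above.
csv = (
    "lat;lon;time_deb;time_end;time_peak;duration;categ;imax\n"
    "-76.92;-180.0;2003-01-01;2003-01-18;2003-01-04;18;4.0;2.34\n"
    "-76.92;-180.0;2003-12-18;2003-12-23;2003-12-21;6;4.0;2.33\n"
)
df = pd.read_csv(io.StringIO(csv), sep=";",
                 parse_dates=["time_deb", "time_end", "time_peak"])

# e.g. keep only events lasting at least 10 days
long_events = df[df["duration"] >= 10]
```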

To do

  • Add shapefile usage.
  • Add an option to remove text files.

Contribution

  • Install poetry in your python/conda environment.
  • Clone this repository and checkout to your branch.
  • Create a poetry environment for dev by running poetry install.
  • Make your changes.
  • Test them with the command poetry run python my_script.py ....
  • Commit and push.
  • Open a merge request.

References

Hobday, A.J. et al. (2016), A hierarchical approach to defining marine heatwaves, Progress in Oceanography, 141, pp. 227-238, doi:10.1016/j.pocean.2015.12.014
