package for metocean statistics
SeaStats
seastats is a simple package to compare and analyse two time series. We use the following convention in this repo:
- sim: modelled surge time series
- obs: observed surge time series
The main function is:
def get_stats(
    sim: Series,
    obs: Series,
    metrics: Sequence[str] = SUGGESTED_METRICS,
    quantile: float = 0,
    cluster: int = 72,
    round: int = -1
) -> dict[str, float]
Calculates various statistical metrics between the simulated and observed time series data.
Parameters:
- sim (pd.Series). The simulated time series data.
- obs (pd.Series). The observed time series data.
- metrics (list[str]). (Optional) The list of statistical metrics to calculate. If metrics = ["all"], all items in SUPPORTED_METRICS will be calculated. Default is all items in SUGGESTED_METRICS.
- quantile (float). (Optional) Quantile used to calculate the metrics. Default is 0 (no selection).
- cluster (int). (Optional) Cluster duration for grouping storm events. Default is 72 hours.
- round (int). (Optional) Rounding applied to the results. Default is -1 (no rounding).
Returns a dictionary containing the calculated metrics and their corresponding values, with two types of metrics:
- The "general" metrics: all the basic metrics needed for signal comparison (RMSE, RMS, correlation, etc.). See details below.
  - bias: Bias
  - rmse: Root Mean Square Error
  - rms: Root Mean Square
  - rms_95: Root Mean Square for data points above the 95th percentile
  - sim_mean: Mean of simulated values
  - obs_mean: Mean of observed values
  - sim_std: Standard deviation of simulated values
  - obs_std: Standard deviation of observed values
  - mae: Mean Absolute Error
  - mse: Mean Square Error
  - nse: Nash-Sutcliffe Efficiency
  - lamba: Lambda index
  - cr: Pearson correlation coefficient
  - cr_95: Pearson correlation coefficient for data points above the 95th percentile
  - slope: Slope of the Model/Obs correlation
  - intercept: Intercept of the Model/Obs correlation
  - slope_pp: Slope of the Model/Obs correlation of percentiles
  - intercept_pp: Intercept of the Model/Obs correlation of percentiles
  - mad: Mean Absolute Deviation
  - madp: Mean Absolute Deviation of percentiles
  - madc: mad + madp
  - kge: Kling–Gupta Efficiency
- The storm metrics: a PoT (peaks-over-threshold) selection is done on the observed signal (using the match_extremes() function). The function returns the extreme event peak values, in decreasing order, for the observed and modelled signals (and the time lag between events). See details below.
  - R1: Difference between observed and modelled for the biggest storm
  - R1_norm: Normalized R1 (R1 divided by observed value)
  - R3: Average difference between observed and modelled for the three biggest storms
  - R3_norm: Normalized R3 (R3 divided by observed value)
  - error: Average difference between observed and modelled for all storms
  - error_norm: Normalized error (error divided by observed value)
General metrics
A. Dimensional Statistics:
Mean Error (or Bias)
$$\langle x_c - x_m \rangle = \langle x_c \rangle - \langle x_m \rangle$$
RMSE (Root Mean Squared Error)
$$\sqrt{\langle(x_c - x_m)^2\rangle}$$
Mean-Absolute Error (MAE):
$$\langle |x_c - x_m| \rangle$$
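As a rough illustration, the three dimensional statistics above can be computed with the standard library only. The sketch assumes that $x_c$ is the computed (simulated) series and $x_m$ the measured (observed) one, which the formulas do not state explicitly. The function names are illustrative and are not the seastats API.

```python
import math


def bias(sim, obs):
    """Mean error <x_c - x_m> (assumed convention: simulated minus observed)."""
    return sum(s - o for s, o in zip(sim, obs)) / len(sim)


def rmse(sim, obs):
    """Root mean squared error sqrt(<(x_c - x_m)^2>)."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(sim))


def mae(sim, obs):
    """Mean absolute error <|x_c - x_m|>."""
    return sum(abs(s - o) for s, o in zip(sim, obs)) / len(sim)


# Made-up sample values, for illustration only
sim = [0.1, 0.4, 0.2, 0.5]
obs = [0.2, 0.3, 0.4, 0.5]
stats = {"bias": bias(sim, obs), "rmse": rmse(sim, obs), "mae": mae(sim, obs)}
```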
B. Dimensionless Statistics (best closer to 1)
Performance Scores (PS) or Nash-Sutcliffe Efficiency (NSE):
$$1 - \frac{\langle (x_c - x_m)^2 \rangle}{\langle (x_m - x_R)^2 \rangle}$$
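A minimal sketch of the NSE formula follows. The reference value $x_R$ is not defined in the text; the sketch assumes it is the mean of the observed series, which is the usual choice for NSE. The function name is hypothetical.

```python
def nse(sim, obs):
    """Nash-Sutcliffe Efficiency, assuming x_R is the mean of the observations."""
    x_ref = sum(obs) / len(obs)  # assumption: reference value = observed mean
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - x_ref) ** 2 for o in obs)
    return 1 - num / den


# Made-up sample values, for illustration only
sim = [0.1, 0.4, 0.2, 0.5]
obs = [0.2, 0.3, 0.4, 0.5]
score = nse(sim, obs)
perfect = nse(obs, obs)  # a perfect simulation gives exactly 1
```

With this choice of $x_R$, a score of 0 means the simulation does no better than predicting the observed mean everywhere, and negative scores mean it does worse.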
Correlation Coefficient (R):
$$\frac {\langle x_{m}x_{c}\rangle -\langle x_{m}\rangle \langle x_{c}\rangle }{{\sqrt {\langle x_{m}^{2}\rangle -\langle x_{m}\rangle ^{2}}}{\sqrt {\langle x_{c}^{2}\rangle -\langle x_{c}\rangle ^{2}}}}$$
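The correlation coefficient can be written term by term from the averages in the formula above. This is only a sketch with an illustrative function name; in practice a library routine such as pandas' Series.corr would normally be used instead.

```python
import math


def pearson_r(sim, obs):
    """Pearson correlation built from the averages <x_m x_c>, <x_m>, <x_c>."""
    n = len(sim)
    mean_c = sum(sim) / n
    mean_m = sum(obs) / n
    mean_mc = sum(s * o for s, o in zip(sim, obs)) / n
    var_c = sum(s * s for s in sim) / n - mean_c ** 2
    var_m = sum(o * o for o in obs) / n - mean_m ** 2
    return (mean_mc - mean_m * mean_c) / (math.sqrt(var_m) * math.sqrt(var_c))


# Made-up sample values: sim is exactly twice obs, so the correlation is 1
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 4.0, 6.0, 8.0]
r_same_trend = pearson_r(sim, obs)
r_opposite_trend = pearson_r(sim[::-1], obs)
```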
Kling–Gupta Efficiency (KGE):
$$1 - \sqrt{(r-1)^2 + b^2 + (g-1)^2}$$
with:
- r the correlation
- b the modified bias term (see ref) $$\frac{\langle x_c \rangle - \langle x_m \rangle}{\sigma_m}$$
- g the std dev term $$\frac{\sigma_c}{\sigma_m}$$
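The three KGE terms can be combined as in the sketch below. It assumes the subscript c refers to the simulated series and m to the observed one, and it uses the population standard deviation; both are assumptions, since the text does not specify them. The function name is hypothetical.

```python
import math
from statistics import fmean, pstdev


def kge(sim, obs):
    """Kling-Gupta Efficiency with the modified bias term b."""
    mean_c, mean_m = fmean(sim), fmean(obs)
    std_c, std_m = pstdev(sim), pstdev(obs)  # assumption: population std dev
    cov = fmean([(s - mean_c) * (o - mean_m) for s, o in zip(sim, obs)])
    r = cov / (std_c * std_m)        # correlation
    b = (mean_c - mean_m) / std_m    # modified bias term
    g = std_c / std_m                # std dev term
    return 1 - math.sqrt((r - 1) ** 2 + b ** 2 + (g - 1) ** 2)


# Made-up sample values: sim is obs shifted by +1, so only the bias term is non-zero
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 3.0, 4.0, 5.0]
kge_shifted = kge(sim, obs)
kge_perfect = kge(obs, obs)
```

For a constant offset, r and g are both 1, so the score reduces to 1 minus the absolute offset divided by the observed standard deviation.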
Lambda index ($\lambda$), values closer to 1 indicate better agreement:
$$\lambda = 1 - \frac{\sum{(x_c - x_m)^2}}{\sum{(x_m - \overline{x}_m)^2} + \sum{(x_c - \overline{x}_c)^2} + n(\overline{x}_m - \overline{x}_c)^2 + \kappa}$$
with $\kappa$ defined as:
$$\kappa = 2 \cdot \left| \sum{((x_m - \overline{x}_m) \cdot (x_c - \overline{x}_c))} \right|$$
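The lambda index can be sketched as follows. By default the sketch always includes $\kappa$, as the formulas above are written. In the commonly cited definition of this index, $\kappa$ is only applied when the two series are negatively correlated and is 0 otherwise; the optional flag below, which is a hypothetical name, switches to that behaviour. Which variant seastats implements is not stated here.

```python
def lambda_index(sim, obs, kappa_only_if_negative=False):
    """Lambda index of agreement between two series of equal length."""
    n = len(sim)
    mean_c = sum(sim) / n
    mean_m = sum(obs) / n
    sq_diff = sum((s - o) ** 2 for s, o in zip(sim, obs))
    ss_m = sum((o - mean_m) ** 2 for o in obs)
    ss_c = sum((s - mean_c) ** 2 for s in sim)
    cross = sum((o - mean_m) * (s - mean_c) for s, o in zip(sim, obs))
    kappa = 2 * abs(cross)
    if kappa_only_if_negative and cross >= 0:
        kappa = 0.0  # assumption: variant where kappa only penalises anti-correlation
    return 1 - sq_diff / (ss_m + ss_c + n * (mean_m - mean_c) ** 2 + kappa)


# Made-up sample values, for illustration only
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 4.0, 6.0, 8.0]
lam_as_written = lambda_index(sim, obs)
lam_conditional = lambda_index(sim, obs, kappa_only_if_negative=True)
```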
Storm metrics
The function uses the match_extremes() function (detailed below) and returns:
- R1: the error for the biggest storm
- R3: the mean error for the 3 biggest storms
- error: the mean error for all the storms above the threshold
- R1_norm / R3_norm / error_norm: same methodology, but values are normalised (in %) relative to the observed peaks
Case of NaNs
The storm_metrics() function might return:
{'R1': np.nan,
 'R1_norm': np.nan,
 'R3': np.nan,
 'R3_norm': np.nan,
 'error': np.nan,
 'error_norm': np.nan}
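To make the storm metrics and the NaN case concrete, here is a minimal sketch that starts from already matched peaks. It is not the seastats implementation. It assumes absolute errors, and it assumes that the normalised values are averages of the per-storm error divided by the observed peak; both assumptions are consistent with the example values shown in this README, but they are not confirmed by it. The function name is hypothetical.

```python
import math

KEYS = ("R1", "R1_norm", "R3", "R3_norm", "error", "error_norm")


def storm_metrics_sketch(obs_peaks, sim_peaks):
    """obs_peaks must be sorted in decreasing order, sim_peaks matched one-to-one."""
    if not obs_peaks:
        # No matched storms: every metric is NaN
        return {key: math.nan for key in KEYS}
    err = [abs(s - o) for s, o in zip(sim_peaks, obs_peaks)]
    err_norm = [e / o for e, o in zip(err, obs_peaks)]

    def mean(values):
        return sum(values) / len(values)

    return {
        "R1": err[0],
        "R1_norm": err_norm[0],
        "R3": mean(err[:3]),
        "R3_norm": mean(err_norm[:3]),
        "error": mean(err),
        "error_norm": mean(err_norm),
    }


# Three sample storms (observed and modelled peaks), for illustration only
obs_peaks = [0.803, 0.639, 0.386]
sim_peaks = [0.565, 0.577, 0.400]
metrics = storm_metrics_sketch(obs_peaks, sim_peaks)
empty = storm_metrics_sketch([], [])
```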
Extreme events
Example of implementation:
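from seastats.storms import match_extremes

# sim and obs are the modelled and observed pd.Series; 0.99 is the quantile used for the PoT selection
extremes_df = match_extremes(sim, obs, 0.99, cluster=72)
extremes_df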
The modeled peaks are matched with the observed peaks. The function returns a pd.DataFrame of the observed storm peaks, sorted in decreasing order, as follows:
| time observed | observed | time observed | model | time model | diff | error | error_norm | tdiff |
|---|---|---|---|---|---|---|---|---|
| 2022-01-29 19:30:00 | 0.803 | 2022-01-29 19:30:00 | 0.565 | 2022-01-29 17:00:00 | -0.237 | 0.237 | 0.296 | -2.5 |
| 2022-02-20 20:30:00 | 0.639 | 2022-02-20 20:30:00 | 0.577 | 2022-02-20 20:00:00 | -0.062 | 0.062 | 0.0963 | -0.5 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2022-11-27 15:30:00 | 0.386 | 2022-11-27 15:30:00 | 0.400 | 2022-11-27 17:00:00 | 0.014 | 0.014 | 0.036 | 1.5 |
with:
- diff: the difference between modeled and observed peaks
- error: the absolute difference between modeled and observed peaks
- tdiff: the time difference between modeled and observed peaks (in hours)
NB: the function uses pyextremes in the background, with PoT method, using the quantile value of the observed signal as physical threshold and passes the cluster_duration argument.
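The peaks-over-threshold selection with a cluster duration can be illustrated with a simplified, standard-library sketch. This is not how pyextremes is implemented and it does not cover the matching of modelled peaks. It assumes a linear-interpolation quantile for the threshold and starts a new cluster whenever two consecutive exceedances are more than the cluster duration apart. All names are hypothetical.

```python
from datetime import datetime, timedelta


def quantile(values, q):
    """Quantile with linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def pot_peaks(times, values, q=0.99, cluster_hours=72):
    """Return (time, value) peaks above the q-quantile, one per cluster, largest first."""
    threshold = quantile(values, q)
    exceedances = [(t, v) for t, v in zip(times, values) if v > threshold]
    clusters = []
    for t, v in exceedances:
        # Same storm if the gap to the previous exceedance is within the cluster duration
        if clusters and t - clusters[-1][-1][0] <= timedelta(hours=cluster_hours):
            clusters[-1].append((t, v))
        else:
            clusters.append([(t, v)])
    peaks = [max(cluster, key=lambda tv: tv[1]) for cluster in clusters]
    return sorted(peaks, key=lambda tv: tv[1], reverse=True)


# Made-up hourly series over 10 days with two storms
start = datetime(2022, 1, 1)
times = [start + timedelta(hours=i) for i in range(240)]
values = [0.0] * 240
values[20], values[22], values[200] = 1.0, 0.8, 0.6
peaks = pot_peaks(times, values, q=0.98, cluster_hours=72)
```

In this sample, the exceedances at hours 20 and 22 fall in the same cluster, so only the larger one is kept, while the exceedance at hour 200 forms a second storm.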
The all-NaN result of storm_metrics() described above happens when the function in storms/match_extremes.py couldn't find concomitant storms for the observed and modeled time series.
Usage
See the notebook for details.
Get all metrics in three lines:
import pandas as pd
from seastats import get_stats, GENERAL_METRICS_ALL, STORM_METRICS_ALL

general = get_stats(sim, obs, metrics=GENERAL_METRICS_ALL)
# a different quantile is used for the PoT selection of storms
storm = get_stats(sim, obs, quantile=0.99, metrics=STORM_METRICS_ALL)
pd.DataFrame(dict(general, **storm), index=['abed'])
|  | bias | rmse | rms | rms_95 | sim_mean | obs_mean | sim_std | obs_std | nse | lamba | cr | cr_95 | slope | intercept | slope_pp | intercept_pp | mad | madp | madc | kge | R1 | R1_norm | R3 | R3_norm | error | error_norm |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| abed | -0.007 | 0.086 | 0.086 | 0.088 | -0 | 0.007 | 0.142 | 0.144 | 0.677 | 0.929 | 0.817 | 0.542 | 0.718 | -0.005 | 1.401 | -0.028 | 0.052 | 0.213 | 0.265 | 0.81 | 0.237364 | 0.295719 | 0.147163 | 0.207019 | 0.0938142 | 0.177533 |