This package implements iterative algorithms to compute some basic statistics.

BasicIterativeStatistics

In this repository, basic iterative statistics are implemented.

Installation

The Python package iterative_stats is available on PyPI:

    pip install iterative-stats

Otherwise, one can clone the following repository:

    git clone https://github.com/IterativeStatistics/BasicIterativeStatistics.git

To install the environment, please use poetry:

    poetry install

NB: One can also use a conda or Python environment. The list of dependencies is available in the pyproject.toml file.

To run the tests:

    poetry run pytest tests

or, for a specific test (e.g., tests/unit/test_IterativeMean.py):

    poetry run pytest tests/unit/test_IterativeMean.py

License

The Python package iterative_stats is free software distributed under the BSD 3-Clause License. The terms of the BSD 3-Clause License can be found in the file LICENSE.

Iterative statistics

In this repository, we implement the following statistics:

  • Mean (see examples here)
  • Variance (see examples here)
  • Higher-order moments, skewness and kurtosis (see examples here)
  • Extrema (see examples here)
  • Covariance (see examples here)
  • Threshold (see examples here): counts the number of threshold exceedances.
  • Quantile (see examples here): this statistic is still a work in progress and must be used with care!
  • Sobol indices
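To illustrate what "iterative" means for the simplest statistics in the list above, here is a minimal sketch of a one-pass mean and variance update (Welford's algorithm). It uses NumPy only and does not call the iterative_stats API; the class name RunningMeanVar and its methods are hypothetical and chosen for illustration.

```python
import numpy as np


class RunningMeanVar:
    """One-pass mean and unbiased variance of a stream of fields (Welford's update)."""

    def __init__(self, dim=1):
        self.n = 0                  # number of samples seen so far
        self.mean = np.zeros(dim)   # running mean
        self.m2 = np.zeros(dim)     # running sum of squared deviations from the mean

    def increment(self, x):
        x = np.asarray(x, dtype=float)
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # The second factor uses the mean *after* the update.
        self.m2 += delta * (x - self.mean)

    def variance(self):
        # Unbiased estimate; undefined for fewer than two samples.
        if self.n < 2:
            return np.full_like(self.mean, np.nan)
        return self.m2 / (self.n - 1)


rng = np.random.default_rng(0)
data = rng.normal(size=(500, 3))

stats = RunningMeanVar(dim=3)
for sample in data:
    stats.increment(sample)
```

Each sample is seen once and then discarded, so the memory footprint depends only on the field size, not on the number of samples.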

About the iterative higher-order moments: The iterative higher-order moments are available as IterativeMoments and let the user compute moments up to the 4th order (including skewness and kurtosis). The implementation follows [5].

!!! danger "Beware of 4th order"
    The 4th order (kurtosis) does not pass our tests against SciPy's non-iterative kurtosis computation. While this is something to beware of, we also tested the popular library OpenTURNS, which uses the same equation as we do and fails the same test. Follow the discussion here.
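As an illustration of how moments up to the 4th order can be accumulated in a single pass, the sketch below uses the well-known one-pass update in the spirit of [3]. It is not the package's IterativeMoments class and not necessarily the formulation of [5]; the function name running_moments is hypothetical, and the skewness and kurtosis returned are the biased (population) versions.

```python
import numpy as np


def running_moments(samples):
    """Return (mean, variance, skewness, excess kurtosis) from a single pass."""
    n = 0
    mean = m2 = m3 = m4 = 0.0
    for x in samples:
        n_prev = n
        n += 1
        delta = x - mean
        delta_n = delta / n
        delta_n2 = delta_n * delta_n
        term1 = delta * delta_n * n_prev
        mean += delta_n
        # Update order matters: m4 uses the old m2 and m3, m3 uses the old m2.
        m4 += term1 * delta_n2 * (n * n - 3 * n + 3) + 6 * delta_n2 * m2 - 4 * delta_n * m3
        m3 += term1 * delta_n * (n - 2) - 3 * delta_n * m2
        m2 += term1
    variance = m2 / (n - 1)                  # unbiased variance
    skewness = np.sqrt(n) * m3 / m2 ** 1.5   # biased sample skewness
    kurtosis = n * m4 / (m2 * m2) - 3.0      # biased excess kurtosis
    return mean, variance, skewness, kurtosis


rng = np.random.default_rng(1)
data = rng.exponential(size=2000)
mean, variance, skewness, kurtosis = running_moments(data)
```

Comparing such a result against a two-pass computation on the same data is one way to check an iterative implementation.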

About the quantiles: Following [4], we implement the Robbins-Monro (RM) algorithm for quantile estimation. The tuning parameters of this algorithm have been studied through intensive numerical tests. In the implemented algorithm, the final number of iterations N (i.e., the number of runs of the computer model) is fixed a priori, which is a classical way of dealing with uncertainty quantification problems.
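The idea behind the RM recursion can be sketched as follows. This is the textbook version with a step size c / n^gamma, not the tuned variant of [4] used in the package; the function name robbins_monro_quantile and the parameters q0, c and gamma are assumptions made for this illustration.

```python
import numpy as np


def robbins_monro_quantile(samples, alpha, q0=0.0, c=1.0, gamma=1.0):
    """Estimate the alpha-quantile of a stream with a Robbins-Monro recursion."""
    q = q0
    for n, x in enumerate(samples, start=1):
        step = c / n ** gamma
        # Move down if the sample fell at or below q, up otherwise, so that
        # the fraction of samples below q drifts toward alpha.
        q -= step * (float(x <= q) - alpha)
    return q


rng = np.random.default_rng(0)
N = 20_000  # number of iterations, fixed in advance
stream = rng.uniform(0.0, 1.0, size=N)
median_estimate = robbins_monro_quantile(stream, alpha=0.5)
```

The estimate is sensitive to c and gamma, which is why the statistic should be used with care: a step that decays too fast freezes the estimate early, while one that decays too slowly leaves it noisy.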

About the Sobol indices: The package also contains more advanced statistics, the Sobol indices. For each method (Martinez, Saltelli and Jansen), the first-order and total-order indices are computed iteratively. The second-order indices are also available for the Martinez and Jansen methods (the second order for the Saltelli method is still a work in progress).

  • Pearson coefficient (Martinez): examples are available here.
  • Jansen method: examples are available here.
  • Saltelli method: examples are available here.

NB: This package also contains useful methods for performing iterative statistics computations, such as shifted averaging and shifted dot product computation:

  • Shifted dot product (see example here)
  • Shifted mean (see example here)

Fault-Tolerance

For each statistics class, we implement save_state() and load_from_state() methods, which respectively save the current state and create a new object of type IterativeStatistics from a state object (a Python dictionary).

These methods can be used as follows (for example):

# Create a fresh IterativeMean object (no previous state)
iterativeMean = IterativeMean(dim=1)

# ... Do some computations

# Save the current state
state_obj = iterativeMean.save_state() 

# Reload an IterativeMean object of state state_obj
iterativeMean_reload = IterativeMean(dim=1, state=state_obj)

NB: the methods save_state() and load_from_state() are not available yet for the quantile and Saltelli Sobol indices. This is still a work in progress.
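The checkpoint-and-resume pattern described above can be sketched with a self-contained toy class. ToyIterativeMean is a hypothetical stand-in, not the package's IterativeMean, and the dictionary keys "iteration" and "mean" are invented for this illustration; the use of pickle for persistence is also an assumption.

```python
import pickle

import numpy as np


class ToyIterativeMean:
    """Hypothetical stand-in for a statistics class with a dictionary state."""

    def __init__(self, dim=1, state=None):
        if state is None:
            self.iteration = 0
            self.mean = np.zeros(dim)
        else:
            # Rebuild the object from a previously saved state.
            self.iteration = state["iteration"]
            self.mean = np.array(state["mean"], dtype=float)

    def increment(self, x):
        self.iteration += 1
        self.mean += (np.asarray(x, dtype=float) - self.mean) / self.iteration

    def save_state(self):
        return {"iteration": self.iteration, "mean": self.mean.copy()}


rng = np.random.default_rng(0)
data = rng.normal(size=(100, 2))

# First run: process half of the data, then checkpoint.
first_run = ToyIterativeMean(dim=2)
for sample in data[:50]:
    first_run.increment(sample)
checkpoint = pickle.dumps(first_run.save_state())

# After a restart: rebuild from the checkpoint and process the rest.
resumed = ToyIterativeMean(dim=2, state=pickle.loads(checkpoint))
for sample in data[50:]:
    resumed.increment(sample)
```

Because the state holds everything the update needs (here the iteration count and the running mean), the resumed object ends with the same result as an uninterrupted run.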

Examples

Here is an example of how to use iterative-stats to compute Sobol indices iteratively.

import numpy as np

from iterative_stats.sensitivity.sensitivity_martinez import IterativeSensitivityMartinez as IterativeSensitivityMethod

dim = 10  # field size
nb_parms = 3  # number of parameters
nb_sim = 1000  # number of simulations (assumption: illustrative value)
second_order = True  # whether to compute the second-order indices

# Create an instance of the object IterativeSensitivityMethod
sensitivity_instance = IterativeSensitivityMethod(dim = dim, nb_parms = nb_parms, second_order = second_order)

# Generate an experimental design (Uniform3D lives in the repository's tests folder)
from tests.mock.uniform_3d import Uniform3D
input_sample_generator = Uniform3D(nb_parms = nb_parms, nb_sim = nb_sim, second_order=second_order).generator()

# Load a function (here the Ishigami function)
from tests.mock.ishigami import ishigami

# Consume the design one sample at a time until the generator is exhausted
for input_sample in input_sample_generator:
    # Apply the Ishigami function to each row of the sample
    output_sample = np.apply_along_axis(ishigami, 1, input_sample)
    # Update the sensitivity instance
    sensitivity_instance.increment(output_sample)

first_order = sensitivity_instance.getFirstOrderIndices()
print(f" First Order Sobol indices (Martinez method): {first_order}")

# Assumption: getTotalOrderIndices() and getSecondOrderIndices() are named by
# analogy with getFirstOrderIndices(); check these names against the package.
total_order = sensitivity_instance.getTotalOrderIndices()
print(f" Total Order Sobol indices (Martinez method): {total_order}")

second_order_indices = sensitivity_instance.getSecondOrderIndices()
print(f" Second Order Sobol indices (Martinez method): {second_order_indices}")

NB: The computation of Sobol indices requires the preparation of a specific experimental design based on the pick-freeze method (see [1] for details). This method has been implemented in the class AbstractExperiment and some examples can be found here.
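To make the pick-freeze idea concrete, here is a minimal batch (non-iterative) sketch with NumPy. It does not use AbstractExperiment; the helper pick_freeze_design, the vectorized ishigami function and its parameters a = 7 and b = 0.1 are assumptions made for this illustration. Two independent input matrices A and B are drawn, and each C_i is a copy of A whose column i is taken from B. The Martinez estimators are then Pearson correlations between the corresponding outputs.

```python
import numpy as np


def ishigami(X, a=7.0, b=0.1):
    """Ishigami test function applied to each row of X (shape (n, 3))."""
    return np.sin(X[:, 0]) + a * np.sin(X[:, 1]) ** 2 + b * X[:, 2] ** 4 * np.sin(X[:, 0])


def pick_freeze_design(n, k, rng):
    """Return A, B and the list of C_i (A with column i taken from B)."""
    A = rng.uniform(-np.pi, np.pi, size=(n, k))
    B = rng.uniform(-np.pi, np.pi, size=(n, k))
    C = []
    for i in range(k):
        Ci = A.copy()
        Ci[:, i] = B[:, i]
        C.append(Ci)
    return A, B, C


rng = np.random.default_rng(0)
n, k = 20_000, 3
A, B, C = pick_freeze_design(n, k, rng)

y_a = ishigami(A)
y_b = ishigami(B)

first_order, total_order = [], []
for Ci in C:
    y_c = ishigami(Ci)
    # Y_B and Y_Ci share only input i; Y_A and Y_Ci share every input except i.
    first_order.append(np.corrcoef(y_b, y_c)[0, 1])
    total_order.append(1.0 - np.corrcoef(y_a, y_c)[0, 1])
```

For the Ishigami function with a = 7 and b = 0.1, the analytical first-order indices are approximately 0.314, 0.442 and 0, so the estimates should approach these values as n grows. An iterative version replaces the batch correlations with running covariances and variances updated one sample at a time.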

References

The implementation of the iterative formulas is based on the following papers:

[1] Théophile Terraz, Alejandro Ribes, Yvan Fournier, Bertrand Iooss, and Bruno Raffin. 2017. Melissa: large scale in transit sensitivity analysis avoiding intermediate files. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '17). Association for Computing Machinery, New York, NY, USA, Article 61, 1–14. https://doi.org/10.1145/3126908.3126922

[2] M. Baudin, K. Boumhaout, T. Delage, B. Iooss, and J-M. Martinez. 2016. Numerical stability of Sobol' indices estimation formula. In Proceedings of the 8th International Conference on Sensitivity Analysis of Model Output (SAMO 2016). Le Tampon, Réunion Island, France.

[3] Philippe Pébay. 2008. Formulas for robust, one-pass parallel computation of covariances and arbitrary-order statistical moments. Sandia Report SAND2008-6212, Sandia National Laboratories 94 (2008).

[4] Bertrand Iooss and Jérôme Lonchampt. 2021. Robust tuning of Robbins-Monro algorithm for quantile estimation: Application to wind-farm asset management. In ESREL 2021.

[5] Xiangrui Meng. 2015. Simpler online updates for arbitrary-order central moments. arXiv preprint arXiv:1510.04923.
