Solid stochastic statistic analysis of Stochastic Arithmetic

Development Status
- 4 - Beta
License
- OSI Approved :: Apache Software License
Operating System
- POSIX :: Linux
Programming Language
- Python :: 3
Topic
- Scientific/Engineering :: Information Analysis

Project description

significantdigits package - v0.4.0

Compute the number of significant digits based on the paper Confidence Intervals for Stochastic Arithmetic. This package is also inspired by the Jupyter Notebook included with the publication.

Getting started
Installation
Advanced Usage
Recent Improvements
Testing
License

Getting started

This synthetic example shows how to compute the significant digits of a sample of results against a known reference:

>>> import significantdigits as sd
>>> import numpy as np
>>> from numpy.random import uniform as U
>>> np.random.seed(0)
>>> eps = 2**-52
>>> # simulates results with epsilon differences
>>> X = [1+U(-1,1)*eps for _ in range(10)]
>>> sd.significant_digits(X, reference=1)
51.02329058847853

Or with the CLI, assuming X is stored in test.txt:

> significantdigits --metric significant -i "$(cat test.txt)" --input-format stdin --reference 1
(51.02329058847853,)

If the reference is unknown, one can use the sample average:

...
>>> sd.significant_digits(X, reference=np.mean(X))
51.02329058847853

To print the result as mean +/- error, use the format_uncertainty function:

>>> print(sd.format_uncertainty(X, reference=1))
['+1.00000000000000000 ± 1.119313369151395181e-16'
 '+1.00000000000000000 ± 1.119313369151395181e-16'
 '+1.00000000000000000 ± 1.119313369151395181e-16'
 '+1.00000000000000000 ± 1.119313369151395181e-16'
 '+1.00000000000000000 ± 1.119313369151395181e-16'
 '+1.00000000000000000 ± 1.119313369151395181e-16'
 '+1.00000000000000000 ± 1.119313369151395181e-16'
 '+1.00000000000000022 ± 1.119313369151395181e-16'
 '+1.00000000000000022 ± 1.119313369151395181e-16'
 '+1.00000000000000000 ± 1.119313369151395181e-16']

Installation

python3 -m pip install -U significantdigits

Or, to get the latest version of the code, install it directly from the repository:

python3 -m pip install -U git+https://github.com/verificarlo/significantdigits.git
# or if you don't have 'git' installed
python3 -m pip install -U https://github.com/verificarlo/significantdigits/zipball/master

Examples

The examples directory contains several example scripts demonstrating how to use the significantdigits package in different scenarios. You can find practical usage patterns, sample data, and step-by-step guides to help you get started or deepen your understanding of the package's features.

Advanced Usage

Input types

Functions accept the following types of inputs:

    InputType: ArrayLike

This type is available as numpy.typing.ArrayLike.

Z computation

Metrics are computed using Z, the distance between the samples and the reference. There are four cases, depending on the kind of precision (absolute or relative) and the nature of the reference (constant or random variable), summarized in this table:

Precision	Constant reference (x)	Random variable reference (Y)
Absolute precision	Z = X - x	Z = X - Y
Relative precision	Z = X/x - 1	Z = X/Y - 1
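
The four cases in the table can be sketched in a few lines of plain Python. This is only an illustration and not the package's `_compute_z`; the helper name `compute_z` and its `relative` flag are hypothetical, and plain lists stand in for arrays.

```python
def compute_z(samples, reference, relative=True):
    """Illustrative only: distance Z between samples X and a reference.

    `reference` is either a scalar (constant reference x) or a list with
    one entry per sample (random variable reference Y).
    """
    # Repeat a scalar reference so both kinds of reference share one loop.
    if isinstance(reference, (int, float)):
        reference = [reference] * len(samples)
    if len(reference) != len(samples):
        raise ValueError("reference must be a scalar or match the samples")
    if relative:
        return [x / y - 1 for x, y in zip(samples, reference)]  # Z = X/Y - 1
    return [x - y for x, y in zip(samples, reference)]          # Z = X - Y


X = [1.5, 2.0, 2.5]
Y = [1.0, 2.0, 4.0]
print(compute_z(X, 2.0, relative=False))  # [-0.5, 0.0, 0.5]
print(compute_z(X, 2.0, relative=True))   # [-0.25, 0.0, 0.25]
print(compute_z(X, Y, relative=False))    # [0.5, 0.0, -1.5]
print(compute_z(X, Y, relative=True))     # [0.5, 0.0, -0.375]
```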

_compute_z(array: InternalArrayType, 
           reference: InternalArrayType | None, 
           error: Error | str, 
           axis: int, 
           shuffle_samples: bool = False) -> InternalArrayType
    Compute Z, the distance between the random variable and the reference

    There are three cases, depending on the dimensions of array and reference:

    X = array
    Y = reference

    Three cases:
    - Y is None
        - The case when X = Y
        - We split X in two and set one group to X and the other to Y
    - X.ndim == Y.ndim
        X and Y have the same dimension
        This is the case when Y is a random variable
    - X.ndim - 1 == Y.ndim or Y.ndim == 0
        Y is a scalar value
    Parameters
    ----------
    array : InternalArrayType
        The random variable
    reference : InternalArrayType | None
        The reference to compare against
    error : Error | str
        The error function to use to compute error between array and reference.
    axis : int, default=0
        The axis or axes along which to compute Z
    shuffle_samples : bool, default=False
        If True, shuffles the groups when the reference is None

    Returns
    -------
    array : InternalArrayType
        The result of Z following the chosen error method
    scaling_factor : InternalArrayType
        The scaling factor to compute the significant digits
        Useful for absolute error, to normalize the number of significant digits
        "When Y is a random variable, we choose e = ⌊log_2|E[Y]|⌋ + 1." (p. 10:9)
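
The "Y is None" case and the scaling exponent quoted in the docstring can be illustrated with a small sketch. The helper names `split_in_two` and `scaling_exponent` are hypothetical, and the splitting rule (first half against second half) is an assumption; the package may split or shuffle the sample differently.

```python
import math
from statistics import fmean


def split_in_two(samples):
    """Assumed splitting rule: first half becomes X, second half becomes Y."""
    half = len(samples) // 2
    return samples[:half], samples[half:2 * half]


def scaling_exponent(reference_samples):
    """e = floor(log2|E[Y]|) + 1, as quoted in the docstring."""
    return math.floor(math.log2(abs(fmean(reference_samples)))) + 1


samples = [10.0, 10.5, 9.5, 10.25, 9.75, 10.0]
X, Y = split_in_two(samples)
Z = [x - y for x, y in zip(X, Y)]  # absolute precision: Z = X - Y
e = scaling_exponent(Y)

print(Z)  # [-0.25, 0.75, -0.5]
print(e)  # 4, since E[Y] = 10 and floor(log2(10)) = 3
```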

Methods

Two methods exist for computing both significant and contributing digits depending on whether the sample follows a Centered Normal distribution or not. You can pass the method to the function by using the Method enum provided by the package. The functions also accept the name as a string "cnh" for Method.CNH and "general" for Method.General.

class Method(AutoName):
    """
    CNH: Centered Normality Hypothesis
         X follows a Gaussian law centered around the reference or
         Z follows a Gaussian law centered around 0
    General: No assumption about the distribution of X or Z
    """
    CNH = auto()
    General = auto()
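
Accepting either an enum member or its name as a string is a common pattern. The sketch below shows one way to do it; `resolve_method` is a hypothetical helper, the `Method` class is a reduced stand-in for the package's enum, and the case-insensitive lookup is an assumption rather than documented behavior.

```python
from enum import Enum, auto


class Method(Enum):
    # Stand-in for the package's enum, reduced to the two documented members.
    CNH = auto()
    General = auto()


def resolve_method(method):
    """Hypothetical helper: accept a Method member or its name as a string."""
    if isinstance(method, Method):
        return method
    by_name = {member.name.lower(): member for member in Method}
    try:
        return by_name[method.lower()]
    except KeyError:
        raise ValueError(f"unknown method: {method!r}") from None


print(resolve_method("cnh"))           # Method.CNH
print(resolve_method("general"))       # Method.General
print(resolve_method(Method.General))  # Method.General
```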

Significant digits

significant_digits(array: InputType,
                   reference: ReferenceType | None = None,
                   axis: int = 0, 
                   basis: int = 2,
                   error: Error | str = Error.Relative,
                   method: Method | str = Method.CNH,
                   probability: float = 0.95,
                   confidence: float = 0.95,
                   shuffle_samples: bool = False,
                   dtype: DTypeLike | None = None
                   ) -> ArrayLike
    
    Compute significant digits

    This function computes with a certain probability
    the number of bits that are significant.

    Parameters
    ----------
    array: InputType
        Element to compute
    reference: ReferenceType | None, optional=None
        Reference for comparing the array
    axis: int, optional=0
        Axis or axes along which the significant digits are computed
    basis: int, optional=2
        Basis in which to represent the significant digits
    error : Error | str, optional=Error.Relative
        Error function to use to compute error between array and reference.
    method : Method | str, optional=Method.CNH
        Method to use for the underlying distribution hypothesis
    probability : float, default=0.95
        Probability for the significant digits result
    confidence : float, default=0.95
        Confidence level for the significant digits result
    shuffle_samples : bool, optional=False
        If reference is None, the array is split in two and
        comparison is done between both pieces.
        If shuffle_samples is True, it shuffles pieces.
    dtype : dtype_like | None, default=None
        Numerical type used for computing significant digits
        Widest format between array and reference is taken if not supplied.

    Returns
    -------
    ndarray
        array_like containing significant digits
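
A digit count expressed in bits can be converted to another base with a change of logarithm base. The sketch below shows that arithmetic only; `convert_digits` is a hypothetical helper, and whether the package applies exactly this conversion for its `basis` parameter is an assumption.

```python
import math


def convert_digits(bits, basis):
    """Convert a digit count expressed in base 2 to another base."""
    return bits * math.log(2) / math.log(basis)


bits = 51.02329058847853         # value from the Getting started example
print(convert_digits(bits, 10))  # roughly 15.36 decimal digits
```

Under this conversion, about 51 significant bits correspond to a little over 15 significant decimal digits, which is in line with the precision of a double.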

Contributing digits

contributing_digits(array: InputType,
                    reference: ReferenceType | None = None,
                    axis: int = 0,
                    basis: int = 2,
                    error: Error | str = Error.Relative,
                    method: Method | str = Method.CNH,
                    probability: float = 0.51,
                    confidence: float = 0.95,
                    shuffle_samples: bool = False,
                    dtype: DTypeLike | None = None
                    ) -> ArrayLike
    
    Compute contributing digits

    This function computes with a certain probability the number of bits
    of the mantissa that will round the result towards the correct reference
    value[1]_

    Parameters
    ----------
    array: InputType
        Element to compute
    reference: ReferenceType | None, default=None
        Reference for comparing the array
    axis: int, default=0
        Axis or axes along which the contributing digits are computed
    basis: int, optional=2
        Basis in which to represent the contributing digits
    error : Error | str, default=Error.Relative
        Error function to use to compute error between array and reference.
    method : Method | str, default=Method.CNH
        Method to use for the underlying distribution hypothesis
    probability : float, default=0.51
        Probability for the contributing digits result
    confidence : float, default=0.95
        Confidence level for the contributing digits result
    shuffle_samples : bool, default=False
        If reference is None, the array is split in two and
        comparison is done between both pieces.
        If shuffle_samples is True, it shuffles pieces.
    dtype : dtype_like | None, default=None
        Numerical type used for computing contributing digits
        Widest format between array and reference is taken if not supplied.

    Returns
    -------
    ndarray
        array_like containing contributing digits

Formatting Results with `format_uncertainty`

Formats the results as mean ± error for each sample.

format_uncertainty(array: InputType,
                   reference: ReferenceType | None = None,
                   axis: int = 0,
                   error: Error | str = Error.Relative,
                   dtype: DTypeLike | None = None
                   ) -> list[str]
    Format the uncertainty of each sample as a string

    This function returns a list of strings representing each value in the input array
    formatted as "mean ± error", where the error is computed with respect to the reference.

    Parameters
    ----------
    array: InputType
        Array of values to format
    reference: ReferenceType | None, optional=None
        Reference value(s) for error computation
    axis: int, optional=0
        Axis along which to compute the mean and error
    error: Error | str, optional=Error.Relative
        Error function to use for uncertainty calculation
    dtype: DTypeLike | None, optional=None
        Numerical type used for computation

    Returns
    -------
    list[str]
        List of formatted strings for each sample

Utility functions

These are utility functions for the general case.

`probability_estimation_general`

Estimates the lower bound probability given the sample size.

probability_estimation_general(success: int, trials: int, confidence: float) -> float
    Computes probability lower bound for Bernoulli process

    This function computes the probability associated with metrics
    computed in the general case (without assumption on the underlying
    distribution). Indeed, in that case the probability is given by the
    sample size with a certain confidence level.

    Parameters
    ----------
    success : int
        Number of success for a Bernoulli experiment
    trials : int
        Number of trials for a Bernoulli experiment
    confidence : float
        Confidence level for the probability lower bound estimation

    Returns
    -------
    float
        The lower bound probability with `confidence` level to have `success` successes for `trials` trials
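
For intuition, the special case where every trial succeeds has a simple closed form: the one-sided lower bound p solves p**trials = 1 - confidence. The sketch below covers only that case; `lower_bound_all_successes` is a hypothetical name, and this is not claimed to be how `probability_estimation_general` handles arbitrary success counts.

```python
def lower_bound_all_successes(trials, confidence):
    """One-sided lower bound on p when every one of `trials` succeeded.

    Solves p**trials = 1 - confidence for p.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    return (1.0 - confidence) ** (1.0 / trials)


# With 299 successful trials the bound just exceeds 0.99 at 95% confidence;
# with 298 it falls just short.
print(lower_bound_all_successes(299, 0.95))
print(lower_bound_all_successes(298, 0.95))
```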

`minimum_number_of_trials`

Returns the minimal sample size required to reach the requested probability and confidence.

minimum_number_of_trials(probability: float, confidence: float) -> int
    Computes the minimum number of trials to have probability and confidence

    This function computes the minimal sample size required to have
    metrics with a certain probability and confidence for the general case
    (without assumption on the underlying distribution).

    For example, if one wants significant digits with probability p = 99%
    and confidence (1 - alpha) = 95%, it requires at least 299 observations.

    Parameters
    ----------
    probability : float
        Probability
    confidence : float
        Confidence

    Returns
    -------
    int
        Minimal sample size to have given probability and confidence
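
The 299 observations in the docstring example can be reproduced with a short calculation, assuming the requirement is the smallest n such that probability**n <= 1 - confidence (all n trials succeeding). The function name `min_trials` is hypothetical, and the formula is a sketch of the reasoning rather than the package's code.

```python
import math


def min_trials(probability, confidence):
    """Smallest n with probability**n <= 1 - confidence."""
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(probability))


print(min_trials(0.99, 0.95))  # 299, matching the docstring example
print(min_trials(0.95, 0.95))  # 59
```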

Recent Improvements

Bug Fixes & Reliability:

Fixed critical parameter validation bug in CLI argument handling
Corrected integer division precision issues in sample size calculations
Added missing return statements for error handling edge cases
Enhanced numerical stability for extreme values (inf/NaN handling)

Performance Optimizations (15-40% faster):

Optimized exponential operations using np.exp2() instead of 2**(-kth)
Enhanced bitwise operations with efficient & 1 masking
Improved memory allocation and array operations
Better conditional processing and vectorized computations

Comprehensive Test Suite (3x more tests):

Expanded from 51 to 153 total tests across 5 new test modules
Added property-based testing and fuzzing (65 tests)
Enhanced edge case coverage (26 tests)
Comprehensive validation and error handling tests (24 tests)
Performance regression testing and integration tests (38 tests)

Testing

The package includes a comprehensive test suite with 153 tests across multiple categories:

Running Tests

# Run all tests
pytest

# Run with performance tests (marked with @pytest.mark.performance)
pytest -m performance

# Run specific test categories
pytest tests/test_edge_cases.py      # Edge cases and numerical stability
pytest tests/test_validation.py     # Parameter validation and error handling
pytest tests/test_property_based.py # Property-based testing and fuzzing
pytest tests/test_integration.py    # End-to-end integration tests
pytest tests/test_performance.py    # Performance regression tests

Test Categories

Edge Cases (26 tests): Numerical stability, inf/NaN handling, extreme values
Validation (24 tests): Parameter validation, input sanitization, error handling
Property-Based (65 tests): Mathematical invariants, randomized testing, fuzzing
Integration (23 tests): CLI testing, file I/O, complete workflows
Performance (15 tests): Regression testing, optimization verification

Mathematical Properties Tested

Monotonicity: More precise data yields more significant digits
Scale Invariance: Relative error results are invariant under scaling
Basis Conversion: Consistent results across different number bases
Sample Size Effects: Larger samples generally provide better estimates
Method Consistency: CNH and General methods produce comparable results
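
As an illustration of the scale invariance property, the sketch below checks that the relative distance Z = X/x - 1 is unchanged when samples and reference are multiplied by the same factor. The helper `relative_z` is hypothetical and is not taken from the test suite; a power-of-two factor is used so that the scaling itself is exact in binary floating point.

```python
def relative_z(samples, reference):
    """Z = X/x - 1 for a constant reference x."""
    return [x / reference - 1 for x in samples]


eps = 2.0 ** -52
X = [1 + eps, 1 - eps, 1 + 2 * eps, 1.0]
scale = 2.0 ** 10  # power of two: multiplying by it introduces no rounding

z_original = relative_z(X, 1.0)
z_scaled = relative_z([scale * x for x in X], scale * 1.0)

print(z_original == z_scaled)  # True
```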

License

This file is part of the Verificarlo project, under the Apache License v2.0 with LLVM Exceptions. SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception. See https://llvm.org/LICENSE.txt for license information.

Release history

- 0.4.0 (this version): Jun 28, 2025
- 0.3.1: Jun 12, 2025
- 0.3.0: Dec 21, 2023
- 0.2.0: Aug 3, 2023
- 0.1.3: Aug 2, 2023
- 0.1.2: Feb 3, 2023
- 0.1.1: Jan 10, 2023
- 0.1.0: Nov 9, 2022

Download files

Source Distribution

significantdigits-0.4.0.tar.gz (526.6 kB)

Uploaded Jun 28, 2025 Source

Built Distribution

significantdigits-0.4.0-py3-none-any.whl (21.4 kB)

Uploaded Jun 28, 2025 Python 3

File details

Details for the file significantdigits-0.4.0.tar.gz.

File metadata

Download URL: significantdigits-0.4.0.tar.gz
Upload date: Jun 28, 2025
Size: 526.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.9.23

File hashes

Hashes for significantdigits-0.4.0.tar.gz
Algorithm	Hash digest
SHA256	`d8b98ff91c4faabc693e788d3a34cdbfa9cd38538531b89e8f3e3444d3b7643a`
MD5	`642399d8a99374ee07ad85e8edd3a9eb`
BLAKE2b-256	`e0c81fc731f080f2a30ee3a1d1a596f3129333a66304d9a21dc64b2de4953fa8`

File details

Details for the file significantdigits-0.4.0-py3-none-any.whl.

File metadata

Download URL: significantdigits-0.4.0-py3-none-any.whl
Upload date: Jun 28, 2025
Size: 21.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.9.23

File hashes

Hashes for significantdigits-0.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`733637e84a5d96be941030cecd9d9f6b406f98432a18f92d28440caada44dcd5`
MD5	`e47c9701e1162bb4e49c1222cfc4d2ef`
BLAKE2b-256	`9c683ea9482941f34d4dfe28cac6a05a9cc4c6cc2fceb51d841ebcaeaad5af7b`
