Python package for accessing DAZ and FTZ flags

SSE flags

NumPy on x86 platforms (IA-32 and AMD64 architectures) uses SSE and/or AVX instructions for floating-point calculations. Unfortunately, on Intel CPUs these instructions are very slow when they operate on subnormal (denormal) numbers. To avoid this performance degradation where slightly worse floating-point accuracy in extreme cases can be tolerated, the DAZ (denormals-are-zero) and FTZ (flush-to-zero) CPU flags were introduced; they make the CPU treat subnormal inputs and/or outputs as zeros. This module provides access to these CPU flags from Python.
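
To make the two flags concrete, here is a minimal pure-Python model of their semantics. It is only an illustration: the helper names (`is_subnormal`, `flush`, `multiply`) are hypothetical and not part of sseflags, and the real flags act in hardware rather than through code like this.

```python
import math
import sys

# Smallest positive normal double (2**-1022); anything nonzero below it is subnormal.
MIN_NORMAL = sys.float_info.min


def is_subnormal(x: float) -> bool:
    """True for nonzero values whose magnitude is below the smallest normal number."""
    return x != 0.0 and abs(x) < MIN_NORMAL


def flush(x: float) -> float:
    """Replace a subnormal value by a zero of the same sign; pass anything else through."""
    return math.copysign(0.0, x) if is_subnormal(x) else x


def multiply(a: float, b: float, daz: bool = False, ftz: bool = False) -> float:
    """Software model of a multiplication under the given flag states."""
    if daz:  # denormals-are-zero: subnormal INPUTS are read as zeros
        a, b = flush(a), flush(b)
    result = a * b
    # flush-to-zero: a subnormal RESULT is replaced by zero
    return flush(result) if ftz else result


tiny = MIN_NORMAL / 2  # a subnormal number (with the hardware flags cleared)
print(multiply(tiny, 2.0) == MIN_NORMAL)     # True: the input is used as is
print(multiply(tiny, 2.0, daz=True))         # 0.0: the subnormal input was zeroed
print(multiply(MIN_NORMAL, 0.5, ftz=True))   # 0.0: the subnormal result was flushed
```

The model assumes that the interpreter itself runs with both hardware flags cleared, which is the usual default.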

To test the effect on your system, use sseflags.benchmark.run() or run

    python3 -m sseflags.benchmark

on the command line. Example output on an Intel i9-12900K (subnormal numbers are very slow):

    Times in milliseconds:
    normal     0.037
    default    1.979
    ========================
             FTZ off  FTZ on
    ------------------------
    DAZ off    1.993   2.037
    DAZ on     6.669   0.037
    ========================

AMD CPUs do not show performance degradation on subnormal numbers in 64-bit mode, so the main effect of enabling DAZ/FTZ there is a slight loss of accuracy. Example benchmarks on an AMD Ryzen 7 6800U (negligible degradation for subnormal numbers; note that the times are in microseconds):

    Times in microseconds:
    normal    14.434
    default   16.834
    ========================
             FTZ off  FTZ on
    ------------------------
    DAZ off   16.829  15.383
    DAZ on    15.353  14.500
    ========================

Nevertheless, DAZ/FTZ might be useful in 32-bit Python (same CPU, noticeable difference):

    Times in milliseconds:
    normal     0.132
    default    0.229
    ========================
             FTZ off  FTZ on
    ------------------------
    DAZ off    0.225   0.131
    DAZ on     0.224   0.131
    ========================

AArch64 (ARM64) has the FZ flag, which treats subnormal numbers as zeros much like DAZ and FTZ combined. For compatibility, this module controls it through the same interface. On ARM CPUs with the FEAT_AFP feature (for example, Apple M3 but not M1), the “alternate floating-point behavior” is used to control “DAZ” and “FTZ” separately.

On other architectures, or if the underlying Cython extension is not built, the module only reports that it has no effect.

sseflags module

type Flags = {'daz': bool | None, 'ftz': bool | None}


get_flags() -> sseflags.Flags
    Query current states of the DAZ and FTZ flags, see set_flags() for details.
    Can be used for restoring the default behavior:

        flags = get_flags()            # remember the original flag states
        set_flags(daz=True, ftz=True)  # enable DAZ and FTZ
        ...                            # do some calculations
        set_flags(**flags)             # restore the original flag states

    Returns
    -------
    flags : dict
        dictionary with the keys 'daz' and 'ftz', values of which represent the
        corresponding flag state: True for set, False for cleared, None if not
        implemented
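
The save/restore pattern above can be wrapped in a context manager, so that the original states are restored even if the calculation raises. This is a sketch, not part of the package: `temporary_flags` is a hypothetical helper, and the fallback stand-ins used when sseflags cannot be imported are an assumption modeled on the documented "no effect" behavior.

```python
from contextlib import contextmanager

try:
    from sseflags import get_flags, set_flags
except ImportError:
    # Stand-ins (an assumption) so that the sketch runs without the package:
    # they mimic the documented behavior on unsupported systems.
    def get_flags():
        return {"daz": None, "ftz": None}

    def set_flags(daz=None, ftz=None, verbose=False):
        return False


@contextmanager
def temporary_flags(daz=None, ftz=None):
    """Hypothetical helper: set the flags inside a `with` block, then restore them."""
    saved = get_flags()                        # remember the original flag states
    implemented = set_flags(daz=daz, ftz=ftz)  # None leaves a flag unchanged
    try:
        yield implemented                      # lets the caller see whether it had any effect
    finally:
        set_flags(**saved)                     # restore, even after an exception


with temporary_flags(daz=True, ftz=True) as implemented:
    pass  # do some calculations here; `implemented` is False if nothing was changed
```

Passing equal `daz` and `ftz` values, as here, also satisfies the AArch64 restriction described under set_flags().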


set_flags(
    daz: bool | None = None,
    ftz: bool | None = None,
    verbose: bool = False
) -> bool
    Set the DAZ (denormals-are-zero) and/or FTZ (flush-to-zero) CPU flags for
    SSE and AVX floating-point calculations, which can be useful for Intel CPUs
    that work very slowly with subnormal (denormal) numbers.

    On AArch64 (ARM64) CPUs without FEAT_AFP (before Armv8.7/Armv9.2), both DAZ
    and FTZ are represented by the single FZ flag, thus the daz and ftz
    parameters must be equal.

    On unsupported architectures, or if the underlying Cython extension was not
    built, this function only reports that it has no effect. The availability
    can be checked by calling set_flags() without arguments.

    Parameters
    ----------
    daz : bool or None, optional
        True to set, False to clear the DAZ flag; None (default) to leave
        unchanged

    ftz : bool or None, optional
        True to set, False to clear the FTZ flag; None (default) to leave
        unchanged

    verbose : bool, optional
        pass True to print a warning if the operation is not implemented

    Returns
    -------
    implemented : bool
        True if this operation is implemented and supported, False if not
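
As a small illustration of the availability check mentioned above, the following sketch calls set_flags() without arguments and formats the result of get_flags(). The helper names `availability` and `describe_flags` are hypothetical, and treating a missing package the same way as an unsupported platform is an assumption made here for convenience.

```python
def describe_flags(flags):
    """Format a Flags-style dictionary as a short human-readable string."""
    names = {True: "on", False: "off", None: "not implemented"}
    parts = []
    for key in ("daz", "ftz"):
        parts.append(key.upper() + ": " + names[flags[key]])
    return ", ".join(parts)


def availability():
    """Return (implemented, flags); falls back to "not implemented" without sseflags."""
    try:
        import sseflags
    except ImportError:
        # Assumption: report a missing package like an unsupported platform.
        return False, {"daz": None, "ftz": None}
    # Calling set_flags() without arguments changes nothing and only reports support.
    return sseflags.set_flags(), sseflags.get_flags()


implemented, flags = availability()
print("implemented:", implemented)
print(describe_flags(flags))
```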

sseflags.benchmark submodule

run(repeat: int = 100, min_t: float = 1.0, verbose: bool = True) -> None
    Run benchmarks with all possible combinations of the DAZ and FTZ flags to
    check their effect on NumPy performance (see run_flags() for details).

    Parameters
    ----------
    repeat : int, optional
        number of iterations in a batch

    min_t : float, optional
        minimal amount of time in seconds to benchmark each combination

    verbose : bool, optional
        pass False to suppress the progress report


run_flags(
    flags: Union[sseflags.Flags, Literal['default', 'normal']],
    repeat: int = 100,
    min_t: float = 1.0
) -> float | None
    Set the DAZ and FTZ flags to given states and run a benchmark of NumPy
    matrix multiplication. Each iteration involves multiplication of normal
    numbers that would produce subnormal numbers and multiplication of
    subnormal numbers by normal numbers, which also would produce subnormal
    numbers.

    The test is designed for clear demonstration of performance degradation (if
    it is present); the effect for real-world data is usually less severe.

    Parameters
    ----------
    flags : dict or str
        dictionary with arguments passed to sseflags.set_flags() after creating
        subnormal test data;

        flags='default' benchmark without changing the flags (thus test data
        might be missing subnormal numbers, which corresponds to running
        self-contained calculations but does not represent calculations with
        external data);

        flags='normal' benchmark normal numbers for reference (should not
        depend on the flags)

    repeat : int, optional
        number of iterations in a batch

    min_t : float, optional
        batches are repeated until this amount of seconds passes

    Returns
    -------
    time : float or None
        average time per iteration in seconds; None if the flags cannot be set
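
The roles of repeat and min_t can be illustrated with a generic timing loop. This is only a sketch of the batching scheme described above, not the package's actual implementation; `average_time` and its `func` argument are hypothetical names.

```python
import time


def average_time(func, repeat=100, min_t=1.0):
    """Run batches of `repeat` calls until at least `min_t` seconds have been
    spent in them; return the average time per call in seconds."""
    total = 0.0
    iterations = 0
    while True:
        start = time.perf_counter()
        for _ in range(repeat):       # one batch
            func()
        total += time.perf_counter() - start
        iterations += repeat
        if total >= min_t:            # always runs at least one batch
            break
    return total / iterations


# Usage: time a cheap stand-in workload for about 0.05 s.
t = average_time(lambda: sum(range(1000)), repeat=10, min_t=0.05)
print(f"{t * 1e6:.3f} microseconds per iteration")
```

Timing whole batches rather than single calls keeps the timer overhead small compared with the measured work.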

sseflags.test submodule

run() -> None
    Basic tests to check the DAZ and FTZ flags and their effect on operations
    involving subnormal numbers.

    n is the smallest normal number, s = n / 2 is a subnormal number;
    the operation s * 2 should produce a normal number (or zero with DAZ),
    the operation n / 2 should produce a subnormal number (or zero with FTZ).
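
The same idea can be sketched with the standard library alone. The version below is an assumption-laden illustration, not the code of sseflags.test: the helper names are hypothetical, the subnormal number is built from its bit pattern (so that it exists even if results are already being flushed), and results are inspected as bit patterns because a floating-point comparison may itself treat subnormal operands as zeros when DAZ is set. Python's own float arithmetic may also use a different unit than NumPy on some builds, so the result describes only the interpreter's scalar operations.

```python
import struct
import sys


def bits(x: float) -> int:
    """Raw IEEE 754 bit pattern of a double, obtained without any arithmetic."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def from_bits(pattern: int) -> float:
    """Build a double directly from its 64-bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", pattern))[0]


N = sys.float_info.min   # n: the smallest positive normal double, 2**-1022
S = from_bits(1 << 51)   # s = n / 2 = 2**-1023, a subnormal number


def probe() -> dict:
    """Infer from two operations whether subnormal inputs/outputs are zeroed."""
    doubled = S * 2.0    # equals n, unless the subnormal input is read as zero
    halved = N / 2.0     # equals s, unless the subnormal result is flushed
    return {"daz": bits(doubled) == 0, "ftz": bits(halved) == 0}


print(probe())
```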

Installation

Compiled wheels for Linux, macOS and Windows can be installed from PyPI. They use the “Stable ABI”, which should be compatible with all Python versions ⩾ 3.10. For portability, a “universal wheel” is also available; it does not contain the Cython extension and thus has no effect on computations, but it can be installed on unsupported systems. It can still benchmark the performance difference between subnormal and normal numbers.
