Python package for accessing DAZ and FTZ flags

SSE flags

NumPy on x86 platforms (IA-32 and AMD64 architectures) uses SSE and/or AVX instructions for floating-point calculations. Unfortunately, on Intel CPUs these instructions are very slow when operating on subnormal (denormal) numbers. To avoid this performance degradation where somewhat worse floating-point accuracy in extreme cases can be tolerated, the CPUs provide two flags: DAZ (denormals-are-zero), which treats subnormal inputs as zeros, and FTZ (flush-to-zero), which replaces subnormal results with zeros. This module provides access to these CPU flags from Python.
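To make the terminology concrete, the following sketch uses only the Python standard library to show where subnormal numbers sit in the double-precision range. The helper is_subnormal() is a hypothetical name introduced here, and the sketch assumes the default CPU state (DAZ and FTZ cleared); with the flags set, the intermediate values could become zeros instead.

```python
import math
import sys

smallest_normal = sys.float_info.min   # smallest positive normal double
smallest_subnormal = math.ulp(0.0)     # smallest positive subnormal double


def is_subnormal(x: float) -> bool:
    """Return True if x is a nonzero double below the normal range."""
    return x != 0.0 and abs(x) < smallest_normal


half = smallest_normal / 2             # halving leaves the normal range

print(is_subnormal(smallest_normal))   # the boundary itself is still normal
print(is_subnormal(half))              # subnormal while FTZ is cleared
print(is_subnormal(smallest_subnormal))
```

Values like half are the ones that DAZ and FTZ replace with zeros.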

To test the effect on your system, call sseflags.benchmark.run() from Python or run

    python3 -m sseflags.benchmark

on the command line. Example output on an Intel i9-12900K (subnormal numbers are very slow):

    Times in milliseconds:
    normal     0.037
    default    1.979
    ========================
             FTZ off  FTZ on
    ------------------------
    DAZ off    1.993   2.037
    DAZ on     6.669   0.037
    ========================

AMD CPUs show almost no performance degradation on subnormal numbers in 64-bit mode, so enabling DAZ/FTZ there brings no real speedup and only slightly reduces the accuracy. Example benchmark on an AMD Ryzen 7 6800U (negligible degradation for subnormal numbers; note that times are in microseconds):

    Times in microseconds:
    normal    14.434
    default   16.834
    ========================
             FTZ off  FTZ on
    ------------------------
    DAZ off   16.829  15.383
    DAZ on    15.353  14.500
    ========================

Nevertheless, DAZ/FTZ might be useful in 32-bit Python (same CPU, noticeable difference):

    Times in milliseconds:
    normal     0.132
    default    0.229
    ========================
             FTZ off  FTZ on
    ------------------------
    DAZ off    0.225   0.131
    DAZ on     0.224   0.131
    ========================
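As a rough way to compare the three tables above, the sketch below divides the "default" time (subnormal data, flags unchanged) by the "normal" time for each system. The numbers are copied from the tables; the slowdown() helper and the dictionary layout are illustrative only.

```python
# (normal, default) times copied from the example tables; units cancel out.
results = {
    "Intel i9-12900K": (0.037, 1.979),
    "AMD Ryzen 7 6800U, 64-bit": (14.434, 16.834),
    "AMD Ryzen 7 6800U, 32-bit": (0.132, 0.229),
}


def slowdown(normal: float, default: float) -> float:
    """How many times slower the subnormal benchmark ran than the normal one."""
    return default / normal


for system, (normal, default) in results.items():
    print(f"{system}: {slowdown(normal, default):.1f}x")
```

This gives a factor of roughly 53 for the Intel example, about 1.2 for the 64-bit AMD example and about 1.7 for the 32-bit AMD example.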

AArch64 (ARM64) has a single FZ flag that treats subnormal numbers as zeros, similarly to DAZ and FTZ combined. For compatibility, this module controls it through the same interface. ARM CPUs with the FEAT_AFP feature (for example, Apple M3 but not M1) also support an “alternate floating-point behavior” that provides control equivalent to separate DAZ and FTZ flags; however, this is not implemented yet (mostly due to the lack of testing).

On other architectures, or if the underlying Cython extension is not built, the module only reports that it has no effect.

sseflags module

type Flags = {'daz': bool | None, 'ftz': bool | None}
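One way to express this shape with the standard typing module is a TypedDict. The class below is a hypothetical stand-in that mirrors the documented keys; it is not claimed to be how the package itself defines sseflags.Flags. The describe() helper is likewise illustrative and reads the values the way get_flags() documents them.

```python
from typing import Optional, TypedDict


class Flags(TypedDict):
    """Hypothetical stand-in mirroring the documented shape of sseflags.Flags."""
    daz: Optional[bool]
    ftz: Optional[bool]


def describe(flags: Flags) -> str:
    """Turn flag states as returned by get_flags() into a readable string."""
    names = {True: "set", False: "cleared", None: "not implemented"}
    return ", ".join(f"{key.upper()} {names[flags[key]]}" for key in ("daz", "ftz"))


print(describe({"daz": True, "ftz": None}))
```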


get_flags() -> sseflags.Flags
    Query the current states of the DAZ and FTZ flags (see set_flags() for
    details). Can be used to restore the original flag states afterwards:

        from sseflags import get_flags, set_flags

        flags = get_flags()            # remember the original flag states
        set_flags(daz=True, ftz=True)  # enable DAZ and FTZ
        ...                            # do some calculations
        set_flags(**flags)             # restore the original flag states

    Returns
    -------
    flags : dict
        dictionary with the keys 'daz' and 'ftz', whose values represent the
        corresponding flag states: True for set, False for cleared, None if
        not implemented
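The save-and-restore pattern can also be wrapped in a context manager, so that the original states come back even if the calculation raises. This is a minimal sketch, not part of the package: temporary_flags() is a hypothetical helper, and the two fake_* functions are in-memory stand-ins so that the example runs without sseflags installed.

```python
from contextlib import contextmanager


@contextmanager
def temporary_flags(get_flags, set_flags, daz=True, ftz=True):
    """Set DAZ/FTZ inside a with-block and restore the saved states afterwards."""
    saved = get_flags()
    set_flags(daz=daz, ftz=ftz)
    try:
        yield saved
    finally:
        set_flags(**saved)  # runs even if the calculation raised


# In-memory stand-ins (assumptions for illustration only); with the package
# available, pass sseflags.get_flags and sseflags.set_flags instead.
_state = {"daz": False, "ftz": False}


def fake_get_flags():
    return dict(_state)


def fake_set_flags(daz=None, ftz=None, verbose=False):
    if daz is not None:
        _state["daz"] = daz
    if ftz is not None:
        _state["ftz"] = ftz
    return True


with temporary_flags(fake_get_flags, fake_set_flags):
    inside = fake_get_flags()  # states seen by the calculation

print(inside, fake_get_flags())
```

Because None means "leave unchanged" in set_flags(), restoring a dictionary that contains None values should be harmless on systems where the flags are not implemented.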


set_flags(
    daz: bool | None = None,
    ftz: bool | None = None,
    verbose: bool = False
) -> bool
    Set the DAZ (denormals-are-zero) and/or FTZ (flush-to-zero) CPU flags for
    SSE and AVX floating-point calculations, which can be useful for Intel CPUs
    that work very slowly with subnormal (denormal) numbers.

    On AArch64 (ARM64) CPUs, both DAZ and FTZ are represented by the FZ flag,
    thus the daz and ftz parameters must be equal.

    On unsupported architectures, or if the underlying Cython extension was not
    built, this function only reports that it has no effect. The availability
    can be checked by calling set_flags() without arguments.

    Parameters
    ----------
    daz : bool or None, optional
        True to set, False to clear the DAZ flag; None (default) to leave
        unchanged

    ftz : bool or None, optional
        True to set, False to clear the FTZ flag; None (default) to leave
        unchanged

    verbose : bool, optional
        pass True to print a warning if the operation is not implemented

    Returns
    -------
    implemented : bool
        True if this operation is implemented and supported, False if not
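A possible way to use the return value is to probe availability first and only then change the flags. The sketch below assumes nothing beyond the documented signature; enable_flush_to_zero() is a hypothetical helper, and the two stand-in functions imitate a working build and a build without the extension so that the example runs on its own.

```python
def enable_flush_to_zero(set_flags) -> bool:
    """Try to enable both DAZ and FTZ; return True only if they were applied."""
    if not set_flags():  # no arguments: availability check only
        return False
    # Equal values also satisfy the AArch64 requirement (single FZ flag).
    return set_flags(daz=True, ftz=True)


# Stand-ins (assumptions for illustration only); with the package installed,
# call enable_flush_to_zero(sseflags.set_flags) instead.
calls = []


def working_set_flags(daz=None, ftz=None, verbose=False):
    calls.append((daz, ftz))
    return True


def inert_set_flags(daz=None, ftz=None, verbose=False):
    return False  # behaves like an unsupported system


print(enable_flush_to_zero(working_set_flags))
print(enable_flush_to_zero(inert_set_flags))
```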

sseflags.benchmark submodule

run(repeat: int = 100, min_t: float = 1.0, verbose: bool = True) -> None
    Run benchmarks with all possible combinations of the DAZ and FTZ flags to
    check their effect on NumPy performance (see run_flags() for details).

    Parameters
    ----------
    repeat : int, optional
        number of iterations in a batch

    min_t : float, optional
        minimal amount of time in seconds to benchmark each combination

    verbose : bool, optional
        pass False to suppress the progress report


run_flags(
    flags: Union[sseflags.Flags, Literal['default', 'normal']],
    repeat: int = 100,
    min_t: float = 1.0
) -> float | None
    Set the DAZ and FTZ flags to given states and run a benchmark of NumPy
    matrix multiplication. Each iteration involves multiplication of normal
    numbers that would produce subnormal numbers and multiplication of
    subnormal numbers by normal numbers, which also would produce subnormal
    numbers.

    The test is designed for clear demonstration of performance degradation (if
    it is present); the effect for real-world data is usually less severe.

    Parameters
    ----------
    flags : dict or str
        dictionary with arguments passed to sseflags.set_flags() after creating
        subnormal test data;

        flags='default': benchmark without changing the flags (thus the test
        data might be missing subnormal numbers, which corresponds to running
        self-contained calculations but does not represent calculations with
        external data);

        flags='normal': benchmark normal numbers for reference (should not
        depend on the flags)

    repeat : int, optional
        number of iterations in a batch

    min_t : float, optional
        batches are repeated until this many seconds have passed

    Returns
    -------
    time : float or None
        average time per iteration in seconds; None if the flags cannot be set
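The per-combination results can be gathered into a small table like the ones shown earlier. The sketch below is only an illustration: flag_table() and format_ms() are hypothetical helpers, and replay_run_flags() is a stand-in that replays the Intel example numbers quoted in this description instead of measuring anything. With the package installed, sseflags.benchmark.run_flags could be passed in its place.

```python
def flag_table(run_flags, repeat=100, min_t=1.0):
    """Collect run_flags() timings for the four DAZ/FTZ combinations."""
    table = {}
    for daz in (False, True):
        for ftz in (False, True):
            flags = {"daz": daz, "ftz": ftz}
            table[daz, ftz] = run_flags(flags, repeat=repeat, min_t=min_t)
    return table


def format_ms(seconds):
    """Render a timing in milliseconds; None means the flags could not be set."""
    return "n/a" if seconds is None else f"{seconds * 1e3:.3f}"


# Stand-in replaying the Intel i9-12900K example (milliseconds), not a measurement.
documented_ms = {(False, False): 1.993, (False, True): 2.037,
                 (True, False): 6.669, (True, True): 0.037}


def replay_run_flags(flags, repeat=100, min_t=1.0):
    return documented_ms[flags["daz"], flags["ftz"]] / 1e3  # seconds


table = flag_table(replay_run_flags)
for daz in (False, True):
    row = "  ".join(format_ms(table[daz, ftz]) for ftz in (False, True))
    print(f"DAZ {'on ' if daz else 'off'}  {row}")
```

Handling None explicitly matters on systems where a combination cannot be set, for example mixed DAZ/FTZ states on AArch64.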

sseflags.test submodule

run() -> None
    Basic tests to check the DAZ and FTZ flags and their effect on operations
    involving subnormal numbers.

    n is the smallest normal number, s = n / 2 is a subnormal number;
    the operation s * 2 should produce a normal number (or zero with DAZ),
    the operation n / 2 should produce a subnormal number (or zero with FTZ).
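The expected outcomes can be illustrated with a simplified software model of the two flags. This is a sketch under stated assumptions, not the package's test code: flush() and model_mul() are hypothetical helpers, the model ignores rounding details at the normal/subnormal boundary, and it assumes the real CPU flags are cleared while it runs.

```python
import math
import sys

N = sys.float_info.min  # smallest normal double (n in the text)


def flush(x: float) -> float:
    """Replace a subnormal value with a zero of the same sign."""
    if x != 0.0 and abs(x) < N:
        return math.copysign(0.0, x)
    return x


def model_mul(a: float, b: float, daz: bool = False, ftz: bool = False) -> float:
    """Multiply a and b the way the given flag states are expected to behave."""
    if daz:  # DAZ acts on the inputs
        a, b = flush(a), flush(b)
    result = a * b
    if ftz:  # FTZ acts on the output
        result = flush(result)
    return result


s = N / 2  # subnormal number (s in the text)

print(model_mul(s, 2.0))            # normal number n
print(model_mul(s, 2.0, daz=True))  # zero: the subnormal input is dropped
print(model_mul(N, 0.5))            # subnormal number s
print(model_mul(N, 0.5, ftz=True))  # zero: the subnormal output is flushed
```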

Installation

Compiled wheels for Linux, macOS and Windows can be installed from PyPI. They use the “Stable ABI”, which should be compatible with all Python versions ⩾3.10. For portability, a “universal wheel” is also available; it does not contain the Cython extension and thus has no effect on computations, but it can be installed on unsupported systems and can still benchmark the performance difference between subnormal and normal numbers.

Download files

Source Distribution

    sseflags-0.3.tar.gz (10.2 kB)

Built Distributions

    sseflags-0.3-py3-none-any.whl (10.3 kB)
        Python 3
    sseflags-0.3-cp310-abi3-win_arm64.whl (17.7 kB)
        CPython 3.10+, Windows ARM64
    sseflags-0.3-cp310-abi3-win_amd64.whl (19.1 kB)
        CPython 3.10+, Windows x86-64
    sseflags-0.3-cp310-abi3-win32.whl (18.3 kB)
        CPython 3.10+, Windows x86
    sseflags-0.3-cp310-abi3-musllinux_1_2_x86_64.whl (18.4 kB)
        CPython 3.10+, musllinux: musl 1.2+ x86-64
    sseflags-0.3-cp310-abi3-musllinux_1_2_i686.whl (18.6 kB)
        CPython 3.10+, musllinux: musl 1.2+ i686
    sseflags-0.3-cp310-abi3-musllinux_1_2_aarch64.whl (19.1 kB)
        CPython 3.10+, musllinux: musl 1.2+ ARM64
    sseflags-0.3-cp310-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (19.2 kB)
        CPython 3.10+, manylinux: glibc 2.17+ ARM64, manylinux: glibc 2.28+ ARM64
    sseflags-0.3-cp310-abi3-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (18.1 kB)
        CPython 3.10+, manylinux: glibc 2.28+ x86-64, manylinux: glibc 2.5+ x86-64
    sseflags-0.3-cp310-abi3-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl (17.9 kB)
        CPython 3.10+, manylinux: glibc 2.28+ i686, manylinux: glibc 2.5+ i686
    sseflags-0.3-cp310-abi3-macosx_11_0_arm64.whl (17.0 kB)
        CPython 3.10+, macOS 11.0+ ARM64
    sseflags-0.3-cp310-abi3-macosx_10_9_x86_64.whl (16.4 kB)
        CPython 3.10+, macOS 10.9+ x86-64

File details

Details for the file sseflags-0.3.tar.gz.

File metadata

  • Download URL: sseflags-0.3.tar.gz
  • Size: 10.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sseflags-0.3.tar.gz
Algorithm    Hash digest
SHA256       0ea50a736ffa3853918eef7a58f2ec07dbe35346308a8101d3a16951bfd570e2
MD5          4f3960baa91b28a42192d575598aeafd
BLAKE2b-256  537e998406a28bec2a6b73d5fff882c0e89b90938a83dc62aa99cc8f8c1093f7

Provenance

The following attestation bundles were made for sseflags-0.3.tar.gz:

Publisher: publish.yml on MikhailRyazanov/SSEflags
