Python package for accessing DAZ and FTZ flags

Project description

SSE flags

NumPy on x86 platforms (IA-32 and AMD64 architectures) uses SSE and/or AVX instructions for floating-point calculations. Unfortunately, on Intel CPUs these instructions are very slow when they operate on subnormal (denormal) numbers. To avoid this performance degradation when somewhat worse floating-point accuracy in extreme cases can be tolerated, the CPU provides two flags: DAZ (denormals-are-zero) treats subnormal inputs as zeros, and FTZ (flush-to-zero) replaces subnormal results with zeros. This module provides access to these CPU flags from Python.
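
As a small illustration of what "subnormal" means for double-precision values, the following sketch uses only the Python standard library. The helper name `is_subnormal` and the sample values are assumptions made for this example; they are not part of sseflags.

```python
import math
import sys

MIN_NORMAL = sys.float_info.min   # smallest positive normal double, about 2.2e-308
MIN_SUBNORMAL = math.ulp(0.0)     # smallest positive subnormal double, about 4.9e-324


def is_subnormal(x: float) -> bool:
    """Return True if x is nonzero but smaller in magnitude than any normal double."""
    return 0.0 < abs(x) < MIN_NORMAL


# Two normal numbers whose exact product (1e-320) lies below the normal range.
a, b = 1e-200, 1e-120
product = a * b

print(is_subnormal(a), is_subnormal(b))  # both operands are normal numbers
print(is_subnormal(product))             # subnormal, unless FTZ already flushed it to zero
```

With FTZ enabled, a product like this would be replaced by zero instead of being stored as a subnormal value.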

To test the effect on your system, call sseflags.benchmark.run() or run

    python3 -m sseflags.benchmark

from the command line. Example output on an Intel i9-12900K (subnormal numbers are about 50 times slower than normal ones):

    Times in milliseconds:
    normal     0.037
    default    1.979
    ========================
             FTZ off  FTZ on
    ------------------------
    DAZ off    1.993   2.037
    DAZ on     6.669   0.037
    ========================

AMD CPUs show no significant performance degradation on subnormal numbers in 64-bit mode, so enabling DAZ/FTZ there only costs a little accuracy. Example benchmark on an AMD Ryzen 7 6800U (negligible degradation for subnormal numbers; note that times are in microseconds):

    Times in microseconds:
    normal    14.434
    default   16.834
    ========================
             FTZ off  FTZ on
    ------------------------
    DAZ off   16.829  15.383
    DAZ on    15.353  14.500
    ========================

Nevertheless, DAZ/FTZ might be useful in 32-bit Python (same CPU, noticeable difference):

    Times in milliseconds:
    normal     0.132
    default    0.229
    ========================
             FTZ off  FTZ on
    ------------------------
    DAZ off    0.225   0.131
    DAZ on     0.224   0.131
    ========================

On other architectures, or if the underlying Cython extension is not built, the module only reports that it has no effect.

sseflags module

type Flags = {'daz': bool | None, 'ftz': bool | None}


get_flags() -> sseflags.Flags
    Query the current states of the DAZ and FTZ flags (see set_flags() for
    details). Can be used for restoring the original behavior:

        flags = get_flags()            # remember the original flag states
        set_flags(daz=True, ftz=True)  # enable DAZ and FTZ
        ...                            # do some calculations
        set_flags(**flags)             # restore the original flag states

    Returns
    -------
    flags : dict
        dictionary with the keys 'daz' and 'ftz', values of which represent the
        corresponding flag state: True for set, False for cleared, None if not
        implemented
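
The save/restore pattern above can also be wrapped in a context manager, so that the original states are restored even if the calculation raises an exception. This is only a sketch: `temporary_flags` is a hypothetical helper, not part of sseflags, and the `fake_get_flags`/`fake_set_flags` functions are in-memory stand-ins so that the example runs without the compiled extension.

```python
from contextlib import contextmanager


@contextmanager
def temporary_flags(get_flags, set_flags, daz=None, ftz=None):
    """Apply DAZ/FTZ states inside a with-block, then restore the previous ones."""
    saved = get_flags()             # remember the original flag states
    set_flags(daz=daz, ftz=ftz)     # None leaves the corresponding flag unchanged
    try:
        yield
    finally:
        set_flags(**saved)          # restore even if the block raises


# Hypothetical in-memory stand-ins that mimic the documented signatures.
_state = {'daz': False, 'ftz': False}


def fake_get_flags():
    return dict(_state)


def fake_set_flags(daz=None, ftz=None):
    if daz is not None:
        _state['daz'] = daz
    if ftz is not None:
        _state['ftz'] = ftz
    return True


with temporary_flags(fake_get_flags, fake_set_flags, daz=True, ftz=True):
    inside = fake_get_flags()       # states seen by the calculations
after = fake_get_flags()            # states after leaving the block

print(inside, after)
```

With the real module, the same helper would presumably be called as `temporary_flags(sseflags.get_flags, sseflags.set_flags, daz=True, ftz=True)`.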


set_flags(
    daz: bool | None = None,
    ftz: bool | None = None,
    verbose: bool = False
) -> bool
    Set the DAZ (denormals-are-zero) and/or FTZ (flush-to-zero) CPU flags for
    SSE and AVX floating-point calculations, which can be useful for Intel CPUs
    that work very slowly with subnormal (denormal) numbers.

    On unsupported architectures, or if the underlying Cython extension was not
    built, this function only reports that it has no effect. The availability
    can be checked by calling set_flags() without arguments.

    Parameters
    ----------
    daz : bool or None, optional
        True to set, False to clear the DAZ flag; None (default) to leave
        unchanged

    ftz : bool or None, optional
        True to set, False to clear the FTZ flag; None (default) to leave
        unchanged

    verbose : bool, optional
        pass True to print a warning if the operation is not implemented

    Returns
    -------
    implemented : bool
        True if this operation is implemented, False if not
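
A minimal sketch of the availability check described above, written so that it also works when the package is not installed at all. The helper name `dazftz_available` is an assumption made for this example.

```python
def dazftz_available() -> bool:
    """Return True if the DAZ/FTZ flags can be controlled in this interpreter."""
    try:
        import sseflags
    except ImportError:
        return False                  # package is not installed
    # Called without arguments, set_flags() leaves both flags unchanged and
    # only reports whether the operation is implemented.
    return bool(sseflags.set_flags())


if dazftz_available():
    print("DAZ/FTZ flags can be changed")
else:
    print("DAZ/FTZ flags are not available here")
```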

sseflags.benchmark submodule

run(repeat: int = 100, min_t: float = 1.0, verbose: bool = True) -> None
    Run benchmarks with all possible combinations of the DAZ and FTZ flags to
    check their effect on NumPy performance (see run_flags() for details).

    Parameters
    ----------
    repeat : int, optional
        number of iterations in a batch

    min_t : float, optional
        minimal amount of time in seconds to benchmark each combination

    verbose : bool, optional
        pass False to suppress the progress report


run_flags(
    flags: Union[sseflags.Flags, Literal['default', 'normal']],
    repeat: int = 100,
    min_t: float = 1.0
) -> float
    Set the DAZ and FTZ flags to given states and run a benchmark of NumPy
    matrix multiplication. Each iteration involves multiplication of normal
    numbers that would produce subnormal numbers and multiplication of
    subnormal numbers by normal numbers, which also would produce subnormal
    numbers.

    The test is designed for clear demonstration of performance degradation (if
    it is present); the effect for real-world data is usually less severe.

    Parameters
    ----------
    flags : dict or str
        dictionary with arguments passed to sseflags.set_flags() after creating
        subnormal test data;

        flags='default': benchmark without changing the flags (thus test data
        might be missing subnormal numbers, which corresponds to running
        self-contained calculations but does not represent calculations with
        external data);

        flags='normal': benchmark normal numbers for reference (should not
        depend on the flags)

    repeat : int, optional
        number of iterations in a batch

    min_t : float, optional
        batches are repeated until this amount of seconds passes

    Returns
    -------
    time : float
        average time per iteration in seconds
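
One way to read the repeat and min_t parameters is as a batch timing loop like the one below. This is a simplified sketch that assumes this interpretation; `average_time` is a hypothetical helper, not the module's implementation, and the timed workload is a placeholder.

```python
import time


def average_time(func, repeat=100, min_t=1.0):
    """Average seconds per call of func, measured in batches of `repeat` calls."""
    total = 0.0
    calls = 0
    while True:
        start = time.perf_counter()
        for _ in range(repeat):          # one batch
            func()
        total += time.perf_counter() - start
        calls += repeat
        if total >= min_t:               # stop once enough time has accumulated
            break
    return total / calls


# Placeholder workload; a real benchmark would time a NumPy matrix product.
t = average_time(lambda: sum(range(1000)), repeat=10, min_t=0.05)
print(f"{t * 1e6:.3f} microseconds per iteration")
```

Running at least one batch before checking the elapsed time keeps the average well defined even for min_t=0.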

Installation

Compiled wheels for Linux, macOS and Windows can be installed from PyPI. They use the “Stable ABI”, which should be compatible with all Python versions ⩾3.10. For portability, a “universal wheel” is also available. It does not contain the Cython extension, and thus has no effect on computations, but it can be installed on unsupported systems and can still benchmark the performance difference between subnormal and normal numbers.

Download files

Source Distribution

sseflags-0.2.tar.gz (8.4 kB)

Tags: Source

Built Distributions

sseflags-0.2-py3-none-any.whl (8.4 kB)

Tags: Python 3

sseflags-0.2-cp310-abi3-win_amd64.whl (17.2 kB)

Tags: CPython 3.10+, Windows x86-64

sseflags-0.2-cp310-abi3-win32.whl (16.4 kB)

Tags: CPython 3.10+, Windows x86

sseflags-0.2-cp310-abi3-musllinux_1_2_x86_64.whl (16.4 kB)

Tags: CPython 3.10+, musllinux: musl 1.2+ x86-64

sseflags-0.2-cp310-abi3-musllinux_1_2_i686.whl (16.6 kB)

Tags: CPython 3.10+, musllinux: musl 1.2+ i686

sseflags-0.2-cp310-abi3-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (16.1 kB)

Tags: CPython 3.10+, manylinux: glibc 2.28+ x86-64, manylinux: glibc 2.5+ x86-64

sseflags-0.2-cp310-abi3-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl (16.0 kB)

Tags: CPython 3.10+, manylinux: glibc 2.28+ i686, manylinux: glibc 2.5+ i686

sseflags-0.2-cp310-abi3-macosx_10_9_x86_64.whl (14.5 kB)

Tags: CPython 3.10+, macOS 10.9+ x86-64

File details

Details for the file sseflags-0.2.tar.gz.

File metadata

  • Download URL: sseflags-0.2.tar.gz
  • Size: 8.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sseflags-0.2.tar.gz
Algorithm Hash digest
SHA256 727a9cbb7b4355651457c790fa0c31dbf72211f29a173921042fff939d89bfc6
MD5 47c5fa45cbb73868bbce01c503fa555e
BLAKE2b-256 47716f6001f2debb21c3698a6ca8e21840d7b899dac54a05665f3e9b8acd626d

Provenance

The following attestation bundles were made for sseflags-0.2.tar.gz:

Publisher: publish.yml on MikhailRyazanov/SSEflags

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file sseflags-0.2-py3-none-any.whl.

File metadata

  • Download URL: sseflags-0.2-py3-none-any.whl
  • Size: 8.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sseflags-0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 b88b2b57ba7282a11eb74b776bcabfaa389ba6b9506eac8ce2eef974ef78b515
MD5 8d21ad3b056bd1abd7578398a344ad04
BLAKE2b-256 9a831ff5e89b9210a34ba1669d062bd3cc0f93081c7ebae003596a26c292f8c0

Provenance

The following attestation bundles were made for sseflags-0.2-py3-none-any.whl:

Publisher: publish.yml on MikhailRyazanov/SSEflags

File details

Details for the file sseflags-0.2-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: sseflags-0.2-cp310-abi3-win_amd64.whl
  • Size: 17.2 kB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sseflags-0.2-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 8372041a3c05a4da385d4b0b253d72b582bfd9c7520fdffa01b9262a36f7f582
MD5 22c1953a55faafb3c39e1103d438790a
BLAKE2b-256 478f56d3d00d563ae5d38ea68a6ad946735d2296f5fd08b2d2ed100ee7236e1c

Provenance

The following attestation bundles were made for sseflags-0.2-cp310-abi3-win_amd64.whl:

Publisher: publish.yml on MikhailRyazanov/SSEflags

File details

Details for the file sseflags-0.2-cp310-abi3-win32.whl.

File metadata

  • Download URL: sseflags-0.2-cp310-abi3-win32.whl
  • Size: 16.4 kB
  • Tags: CPython 3.10+, Windows x86
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sseflags-0.2-cp310-abi3-win32.whl
Algorithm Hash digest
SHA256 d195fc2841fbb5b56da3e6735d0e98c46a6f6f66a7463929694aeae7edd6f54c
MD5 f63e48376d4d5931cb5f9efa9781d939
BLAKE2b-256 7cc9ec9c9f3167a8db4aa34bfe2e827f189a7ea2b8f14cba0a127ac8486a6b08

Provenance

The following attestation bundles were made for sseflags-0.2-cp310-abi3-win32.whl:

Publisher: publish.yml on MikhailRyazanov/SSEflags

File details

Details for the file sseflags-0.2-cp310-abi3-musllinux_1_2_x86_64.whl.

File hashes

Hashes for sseflags-0.2-cp310-abi3-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 00210dbc004277c842bae6ec34f9ae2b084f533fc470ecb16001767435d77226
MD5 8828f940816f755a8ca2590e094deccf
BLAKE2b-256 936100190cf9bcdc1f66d6ffa1d736ee8e80f1f0d49e9d582f2abfbcba6da6f6

Provenance

The following attestation bundles were made for sseflags-0.2-cp310-abi3-musllinux_1_2_x86_64.whl:

Publisher: publish.yml on MikhailRyazanov/SSEflags

File details

Details for the file sseflags-0.2-cp310-abi3-musllinux_1_2_i686.whl.

File hashes

Hashes for sseflags-0.2-cp310-abi3-musllinux_1_2_i686.whl
Algorithm Hash digest
SHA256 88bf0578429756214954cb4db040a5b905484f6c3a344b4a863081291f6ce470
MD5 889a132bffa1daf9949575cdb3591640
BLAKE2b-256 8ecd7b3a49d6fba04958d08e74fbb45f017227a197e2b2eb3e2302bac47d3b19

Provenance

The following attestation bundles were made for sseflags-0.2-cp310-abi3-musllinux_1_2_i686.whl:

Publisher: publish.yml on MikhailRyazanov/SSEflags

File details

Details for the file sseflags-0.2-cp310-abi3-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl.

File hashes

Hashes for sseflags-0.2-cp310-abi3-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl
Algorithm Hash digest
SHA256 9c4667c308bbcf2a9d17e2c0a3c1908702f57363c7213b0d4067040f2ae42f63
MD5 cd84b078cc5000daef940e18058e2abc
BLAKE2b-256 a91514f45c66b26af14faf225536cf3203d53d60d98e39b459377b7f8a8fd102

Provenance

The following attestation bundles were made for sseflags-0.2-cp310-abi3-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl:

Publisher: publish.yml on MikhailRyazanov/SSEflags

File details

Details for the file sseflags-0.2-cp310-abi3-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl.

File hashes

Hashes for sseflags-0.2-cp310-abi3-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl
Algorithm Hash digest
SHA256 3d860e1d4a081814da632ba70e1fcfca795df3e9c4e7e06c7d16d1ec9f2c4862
MD5 ad8cdb08e344022ab2eb44792fd86dd2
BLAKE2b-256 4e60515d4c269b5e182bf7da87b333cdfbae4ab9d8127984e39df20f507667a5

Provenance

The following attestation bundles were made for sseflags-0.2-cp310-abi3-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl:

Publisher: publish.yml on MikhailRyazanov/SSEflags

File details

Details for the file sseflags-0.2-cp310-abi3-macosx_10_9_x86_64.whl.

File hashes

Hashes for sseflags-0.2-cp310-abi3-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 c98f5c784dc95bf420b519b1e64ac8dc0eee5917c37f4a1638da1fca74d13200
MD5 d4fad5a99e628c19a1a4c657f265a14d
BLAKE2b-256 3e19d00983473d9b4ddab216892c05a3c3345e4aaef7c137cc34a9e6662fde63

Provenance

The following attestation bundles were made for sseflags-0.2-cp310-abi3-macosx_10_9_x86_64.whl:

Publisher: publish.yml on MikhailRyazanov/SSEflags
