pychop
A Python package for simulating low precision arithmetic
With the increasing availability and support, in both hardware and software, of lower-precision floating-point arithmetic beyond the IEEE Standard 64-bit double and 32-bit single precisions, low-precision arithmetic operations and number formats, e.g., 16-bit half precision, have been widely studied and exploited in modern scientific computing and machine learning algorithms. Low-precision floating-point arithmetic offers greater throughput, reduced data communication, and lower energy usage. pychop is a Python library for efficient quantization: it converts single- or double-precision numbers into low-bitwidth representations. Following the same functionality as Nick Higham's chop function in MATLAB, pychop simulates low-precision floating-point formats, as well as fixed-point/integer quantization, on top of the single and double precisions that are prevalent on modern machine architectures. pychop also provides Torch and JAX backends, which make it possible to simulate training neural networks in low precision.
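A minimal usage sketch is given below. The import path and the Chop(prec=..., rmode=...) signature are assumptions modeled on the MATLAB chop interface, not a definitive statement of pychop's API; the quick start page documents the actual calls.

```python
# Hedged sketch: the import and the Chop(prec=..., rmode=...) signature are assumptions
# modeled on MATLAB's chop -- see the pychop quick start page for the real API.
import numpy as np
from pychop import Chop  # assumed import path

x = np.linspace(0.0, 1.0, 5, dtype=np.float64)

ch = Chop(prec='h', rmode=1)  # IEEE half precision, round to nearest (assumed arguments)
x_fp16 = ch(x)                # values rounded as if stored in fp16, kept in float64 storage
print(x_fp16)
```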
This package keeps its APIs as consistent as possible with Nick Higham's chop software. For the first four rounding modes, given the same user-specified parameters, pychop produces exactly the same results as the chop software. For stochastic rounding (rmode 5 and 6), both produce the same results when given the same random numbers.
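For context, rounding mode 5 rounds up or down with probability proportional to the distance to the two nearest representable numbers, while mode 6 rounds up or down with equal probability 1/2, following the conventions of the MATLAB chop software. The plain-NumPy sketch below illustrates mode-5-style stochastic rounding on an integer grid; it is a concept illustration, not pychop's implementation.

```python
# Concept illustration of stochastic rounding (proportional variant) on an integer
# grid, written in plain NumPy; this is not pychop's internal implementation.
import numpy as np

def stochastic_round(x, rng):
    lower = np.floor(x)
    frac = x - lower                             # distance to the lower grid point
    return lower + (rng.random(x.shape) < frac)  # round up with probability frac

rng = np.random.default_rng(0)
x = np.array([0.25, 0.5, 0.75])
print(stochastic_round(x, rng))  # each entry rounds up with probability 0.25, 0.5, 0.75
```

Because the result depends only on the supplied random numbers, two implementations that draw the same random values round every entry identically.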
Install
pychop requires a Python 3 environment and relies on the following dependencies:
- python > 3.8
- numpy >=1.7.3
- pandas >=2.0
- torch
- jax
To install the current release via pip, use:
pip install pychop
Supported floating-point formats
The supported floating-point arithmetic formats include:
| format | description |
|---|---|
| 'q43', 'fp8-e4m3' | NVIDIA quarter precision (4 exponent bits, 3 significand (mantissa) bits) |
| 'q52', 'fp8-e5m2' | NVIDIA quarter precision (5 exponent bits, 2 significand bits) |
| 'b', 'bfloat16' | bfloat16 |
| 'h', 'half', 'fp16' | IEEE half precision (the default) |
| 's', 'single', 'fp32' | IEEE single precision |
| 'd', 'double', 'fp64' | IEEE double precision |
| 'c', 'custom' | custom format |
Code examples can be found on the quick start page.
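To make concrete what simulating a narrower format on top of single precision means, the sketch below emulates bfloat16 storage by zeroing the 16 low-order bits of each float32 value, leaving the 7 stored significand bits that bfloat16 keeps. This is a round-toward-zero concept illustration in plain NumPy, not the rounding that pychop itself performs.

```python
# Concept sketch: emulate bfloat16 by keeping only the top 16 bits of a float32
# pattern (sign, 8 exponent bits, 7 stored significand bits). Truncation only;
# pychop's own rounding modes are more careful than this.
import numpy as np

def truncate_to_bfloat16(x):
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFF0000)).view(np.float32)

x = np.float32(3.14159265)
print(truncate_to_bfloat16(x))  # 3.140625, the bfloat16 value obtained by truncation
```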
Contributing
We welcome contributions in any form! Assistance with documentation is always welcome. To contribute, feel free to open an issue, or fork the project, make your changes, and submit a pull request. We will do our best to work through any issues and requests.
Acknowledgement
This project is supported by the European Union (ERC, InEXASCALE, 101075632). Views and opinions expressed are those of the authors only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them.
To do
- Generate data using Nick's chop
- Expand unittests for pychop
- Write quant_dot
- Write FMA
- BLAS
References
[1] Nicholas J. Higham and Srikara Pranesh, Simulating Low Precision Floating-Point Arithmetic, SIAM J. Sci. Comput., 2019.
[2] IEEE Standard for Floating-Point Arithmetic, IEEE Std 754-2019 (revision of IEEE Std 754-2008), IEEE, 2019.
[3] Intel Corporation, BFLOAT16---Hardware Numerics Definition, 2018.
[4] Jean-Michel Muller et al., Handbook of Floating-Point Arithmetic, Springer, 2018.