Fast Hadamard Transform in CUDA, with a PyTorch interface

Features:

  • Supports fp32, fp16, and bf16, for dimensions up to 32768.
  • Implicitly pads with zeros if the dimension is not a power of 2.

Installation

git clone https://github.com/Dao-AILab/fast-hadamard-transform.git fast-hadamard-transform
cd fast-hadamard-transform
pip install -v .

How to use

from fast_hadamard_transform import hadamard_transform

# Signature and docstring of the imported function, for reference:
def hadamard_transform(x, scale=1.0):
    """
    Arguments:
        x: (..., dim)
        scale: float. Multiply the output by this number.
    Returns:
        out: (..., dim)

    Multiply each row of x by the Hadamard transform matrix.
    Equivalent to F.linear(x, torch.tensor(scipy.linalg.hadamard(dim))) * scale.
    If dim is not a power of 2, x is implicitly padded with zeros up to the next power of 2.
    """
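
To make the docstring's semantics concrete, here is a minimal pure-Python sketch of the same operation on a single row. It is not the package's CUDA kernel: hadamard_transform_ref is a hypothetical helper name, it works on a plain list rather than a tensor, and it assumes the result is cut back to the original dim after padding, which is what the documented output shape (..., dim) suggests.

```python
def hadamard_transform_ref(row, scale=1.0):
    """Pure-Python reference for one row; hypothetical helper, not part of the package."""
    dim = len(row)
    # Next power of 2 that is >= dim (dim itself if it already is a power of 2).
    padded = 1
    while padded < dim:
        padded *= 2
    # Implicit zero padding, as described in the docstring.
    out = [float(v) for v in row] + [0.0] * (padded - dim)
    # Butterfly passes: log2(padded) rounds of pairwise sums and differences.
    h = 1
    while h < padded:
        for start in range(0, padded, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    # Assumption: the result is truncated to the original dim, matching out: (..., dim).
    return [v * scale for v in out[:dim]]


print(hadamard_transform_ref([1, 2, 3, 4]))  # [10.0, -2.0, -4.0, 0.0]
print(hadamard_transform_ref([1, 2, 3]))     # [6.0, 2.0, 0.0]
```

When dim is a power of 2, choosing scale = dim ** -0.5 makes the transform orthonormal, so applying it twice returns the input up to rounding.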

Speed

Benchmarked on an A100 with batch sizes that are not too small. Times are reported relative to memcpy (torch.clone), which is a lower bound on the time taken, since any implementation has to read the input from GPU memory and write the output back to GPU memory.

Data type   Dimension    Time taken vs memcpy
fp16/bf16   <= 512       1.0x
fp16/bf16   512 - 8192   <= 1.2x
fp16/bf16   16384        1.3x
fp16/bf16   32768        1.8x
fp32        <= 8192      1.0x
fp32        16384        1.1x
fp32        32768        1.2x
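
A "time taken vs memcpy" ratio like the ones above could be measured with a small helper along these lines. This is only a sketch, not the script that produced the numbers in the table: relative_time and its parameters (sync, iters, repeats) are hypothetical names, and the two callables at the bottom are CPU stand-ins so the example runs without a GPU.

```python
import statistics
import time


def relative_time(fn, baseline, sync=None, iters=20, repeats=5):
    """Median wall-clock time of fn divided by that of baseline (hypothetical helper)."""

    def sample(f):
        f()  # warm-up call, excluded from the timing
        if sync is not None:
            sync()
        start = time.perf_counter()
        for _ in range(iters):
            f()
        if sync is not None:
            sync()  # wait for queued asynchronous work before stopping the clock
        return time.perf_counter() - start

    t_fn = statistics.median(sample(fn) for _ in range(repeats))
    t_base = statistics.median(sample(baseline) for _ in range(repeats))
    return t_fn / t_base


# CPU stand-ins for illustration only; the ratio depends on the machine.
ratio = relative_time(lambda: sorted(range(5000)), lambda: list(range(5000)))
print(f"{ratio:.2f}x")
```

On a GPU, one would presumably pass fn=lambda: hadamard_transform(x), baseline=lambda: x.clone(), and sync=torch.cuda.synchronize, since CUDA kernels launch asynchronously and the clock would otherwise stop before the work finishes.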

Source Distribution

fast_hadamard_transform-1.1.0.tar.gz (7.5 kB)

Uploaded Source

File details

Details for the file fast_hadamard_transform-1.1.0.tar.gz.

File metadata

  • Download URL: fast_hadamard_transform-1.1.0.tar.gz
  • Upload date:
  • Size: 7.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for fast_hadamard_transform-1.1.0.tar.gz
Algorithm Hash digest
SHA256 7c03daf825dc74c7200605a5b758f2d5a3bc47ef3b817bf02e97b24076a83906
MD5 b07a6774985c2a3210adc613b041c435
BLAKE2b-256 81dc7afa54b951cac690742247d153e2e992e30677b944d7330bb7280903fb11
