Kernels for higher-order gradients of Flash Attention.

Flash Hog

This repo contains the code for Flash Higher-Order-Gradients, a.k.a. Flash Hog. The kernel achieves roughly a 3.7x speedup over an XLA-optimized kernel, with memory that scales linearly rather than quadratically.

[Figure: Hog Speedup]
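To give a rough sense of what linear versus quadratic memory scaling means, the sketch below compares the size of a single (N, N) intermediate with a single (N, d) tensor. The head dimension of 64, the float32 element size, and the helper name `tensor_bytes` are assumptions chosen for illustration, not values taken from this repo.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor (float32 assumed by default)."""
    n = 1
    for dim in shape:
        n *= dim
    return n * bytes_per_element


head_dim = 64  # assumed head dimension, for illustration only

for n in (4096, 16384):
    quadratic = tensor_bytes((n, n))        # one (N_Q, N_K) intermediate
    linear = tensor_bytes((n, head_dim))    # one (N, d) tensor
    print(
        f"N={n}: (N, N) = {quadratic / 2**20:.0f} MiB, "
        f"(N, d) = {linear / 2**20:.0f} MiB"
    )
```

Under these assumptions, quadrupling N from 4096 to 16384 grows the (N, N) tensor from 64 MiB to 1024 MiB, while the (N, d) tensor grows only from 1 MiB to 4 MiB.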

Installation

TODO

Method

Flash Hog uses four recomputation passes, which avoids both atomics and saving any intermediate tensors of shape (N_Q, N_K). The first three passes are tiled thread-wise across Q: one computes dd, one computes b, and one computes both dQ' and ddO. A final pass is tiled over K and produces dK' and dV'. The equations we implement are the following:

[Figure: Equations]
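The tiling pattern described above (Q-tiled passes for per-query outputs, a K-tiled pass for per-key outputs) can be illustrated on the ordinary first-order attention backward. The NumPy sketch below is not the Flash Hog kernel and does not implement the higher-order equations; it only shows, under assumed names such as `backward_tiled` and `block`, how recomputing each probability tile lets every output row be written by exactly one tile, so no atomics and no (N_Q, N_K) tensor are needed.

```python
import numpy as np


def attention_reference(q, k, v):
    """Naive attention; materializes the full (N_Q, N_K) matrix."""
    s = q @ k.T / np.sqrt(q.shape[-1])
    m = s.max(axis=-1, keepdims=True)
    p = np.exp(s - m)
    l = p.sum(axis=-1, keepdims=True)
    p = p / l
    lse = (m + np.log(l))[:, 0]  # per-row log-sum-exp
    return p @ v, lse, p


def backward_reference(q, k, v, do):
    """Naive first-order backward with full (N_Q, N_K) intermediates."""
    scale = 1.0 / np.sqrt(q.shape[-1])
    o, _, p = attention_reference(q, k, v)
    dv = p.T @ do
    dp = do @ v.T
    delta = (do * o).sum(axis=-1, keepdims=True)
    ds = p * (dp - delta)
    return ds @ k * scale, ds.T @ q * scale, dv


def backward_tiled(q, k, v, do, o, lse, block=4):
    """Two recomputation passes; only (block, block) tiles exist at once."""
    n_q, d = q.shape
    n_k = k.shape[0]
    scale = 1.0 / np.sqrt(d)
    delta = (do * o).sum(axis=-1)  # shape (N_Q,)
    dq, dk, dv = np.zeros_like(q), np.zeros_like(k), np.zeros_like(v)

    def tile(i, j):
        # Recompute the probability tile from the saved log-sum-exp.
        p = np.exp(q[i:i + block] @ k[j:j + block].T * scale
                   - lse[i:i + block, None])
        dp = do[i:i + block] @ v[j:j + block].T
        return p, p * (dp - delta[i:i + block, None])

    # Pass tiled over Q: each Q tile is the only writer of its dq rows.
    for i in range(0, n_q, block):
        for j in range(0, n_k, block):
            _, ds = tile(i, j)
            dq[i:i + block] += ds @ k[j:j + block] * scale

    # Pass tiled over K: each K tile is the only writer of its dk/dv rows.
    for j in range(0, n_k, block):
        for i in range(0, n_q, block):
            p, ds = tile(i, j)
            dv[j:j + block] += p.T @ do[i:i + block]
            dk[j:j + block] += ds.T @ q[i:i + block] * scale
    return dq, dk, dv


rng = np.random.default_rng(0)
q, k = rng.normal(size=(10, 8)), rng.normal(size=(7, 8))
v, do = rng.normal(size=(7, 5)), rng.normal(size=(10, 5))
o, lse, _ = attention_reference(q, k, v)
dq_t, dk_t, dv_t = backward_tiled(q, k, v, do, o, lse)
dq_r, dk_r, dv_r = backward_reference(q, k, v, do)
print(np.allclose(dq_t, dq_r), np.allclose(dk_t, dk_r), np.allclose(dv_t, dv_r))
```

The trade-off is extra compute: each score tile is recomputed once per pass instead of being stored, which is the same exchange of FLOPs for memory that the four-pass scheme above makes.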
