Kernels for higher-order gradients of Flash Attention.

Flash Hog

This repo contains the code for Flash Higher-Order-Gradients, a.k.a. Flash Hog. The kernel achieves roughly a 3.7x speedup over an XLA-optimized kernel, with memory that scales linearly instead of quadratically.

Hog Speedup
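For orientation, here is a minimal JAX sketch of the kind of quantity a higher-order attention kernel has to produce: the backward pass of the backward pass, written naively with nested `jax.vjp` and left to XLA. This is not the Flash Hog kernel or its API. The function names (`attention`, `attention_bwd`, `attention_bwd_bwd`) and the cotangent names (`ddq`, `ddk`, `ddv`) are hypothetical and chosen only for illustration. Note that this naive version materializes the full (N_Q, N_K) attention matrix.

```python
import jax
import jax.numpy as jnp


def attention(q, k, v):
    """Naive softmax attention; builds the full (N_Q, N_K) matrix."""
    scale = q.shape[-1] ** -0.5
    return jax.nn.softmax((q @ k.T) * scale, axis=-1) @ v


def attention_bwd(q, k, v, do):
    """First-order backward: maps the output cotangent do to (dq, dk, dv)."""
    _, vjp_fn = jax.vjp(attention, q, k, v)
    return vjp_fn(do)


@jax.jit
def attention_bwd_bwd(q, k, v, do, ddq, ddk, ddv):
    """Backward of the backward: gradients w.r.t. (q, k, v, do).

    ddq, ddk, ddv are the cotangents of the first-order gradients
    (hypothetical names, not taken from the Flash Hog source).
    """
    _, vjp_fn = jax.vjp(attention_bwd, q, k, v, do)
    return vjp_fn((ddq, ddk, ddv))


# Small random problem: N_Q = 8, N_K = 12, head dimension 4.
keys = jax.random.split(jax.random.PRNGKey(0), 7)
q, do, ddq = (jax.random.normal(kk, (8, 4)) for kk in keys[:3])
k, v, ddk, ddv = (jax.random.normal(kk, (12, 4)) for kk in keys[3:])

grads = attention_bwd_bwd(q, k, v, do, ddq, ddk, ddv)
```

Because the first-order backward is linear in `do`, the last entry of `grads` equals the forward-mode tangent of `attention` in the direction `(ddq, ddk, ddv)`, which gives a cheap correctness check against `jax.jvp`.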

Installation

TODO

Method

Flash Hog makes four recomputation passes so that it needs neither atomics nor any saved intermediate tensor of shape (N_Q, N_K). The first three passes are tiled thread-wise across Q: one computes dd, one computes b, and one computes both dQ' and ddO. A final pass is tiled over K and produces dK' and dV'. The equations we implement are the following:

Equations
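The same pass structure can be illustrated on a simpler problem. The sketch below applies it to the ordinary first-order attention backward, not to Flash Hog's second-order equations: it does not compute dd, b, dQ', ddO, dK' or dV'. Passes tiled over Q recompute scores one tile at a time and write only their own rows, and a pass tiled over K does the same for the key-side outputs, so no (N_Q, N_K) tensor is stored and no two tiles write to the same output row (hence no atomics). The function name `tiled_attention_bwd`, the `block` parameter and the use of `jax.lax.map` as a stand-in for a tiled kernel launch are all assumptions made for illustration. A real kernel would also tile the inner dimension.

```python
import jax
import jax.numpy as jnp


def attention(q, k, v):
    """Naive softmax attention, used only as a reference."""
    scale = q.shape[-1] ** -0.5
    return jax.nn.softmax((q @ k.T) * scale, axis=-1) @ v


def tiled_attention_bwd(q, k, v, do, block=4):
    """First-order backward in three tiled passes (illustrative, not Flash Hog).

    Assumes N_Q and N_K are multiples of `block`.
    """
    n_q, d = q.shape
    n_k, d_v = v.shape
    scale = d ** -0.5
    q_t = q.reshape(n_q // block, block, d)
    do_t = do.reshape(n_q // block, block, d_v)

    # Pass 1, tiled over Q: per-row statistics only, each of shape (N_Q,).
    def row_stats(args):
        q_blk, do_blk = args
        s = (q_blk @ k.T) * scale                    # (block, N_K)
        lse = jax.nn.logsumexp(s, axis=-1)           # softmax normalizer
        o_blk = jnp.exp(s - lse[:, None]) @ v        # recomputed output rows
        return lse, jnp.sum(o_blk * do_blk, axis=-1)

    lse_t, delta_t = jax.lax.map(row_stats, (q_t, do_t))

    # Pass 2, tiled over Q: each tile writes its own rows of dQ.
    def dq_tile(args):
        q_blk, do_blk, lse_blk, delta_blk = args
        p = jnp.exp((q_blk @ k.T) * scale - lse_blk[:, None])
        ds = p * (do_blk @ v.T - delta_blk[:, None])
        return (ds @ k) * scale

    dq = jax.lax.map(dq_tile, (q_t, do_t, lse_t, delta_t)).reshape(n_q, d)

    # Pass 3, tiled over K: each tile writes its own rows of dK and dV.
    lse, delta = lse_t.reshape(n_q), delta_t.reshape(n_q)

    def dkv_tile(args):
        k_blk, v_blk = args
        p = jnp.exp((q @ k_blk.T) * scale - lse[:, None])   # (N_Q, block)
        ds = p * (do @ v_blk.T - delta[:, None])
        return (ds.T @ q) * scale, p.T @ do

    k_t = k.reshape(n_k // block, block, d)
    v_t = v.reshape(n_k // block, block, d_v)
    dk_t, dv_t = jax.lax.map(dkv_tile, (k_t, v_t))
    return dq, dk_t.reshape(n_k, d), dv_t.reshape(n_k, d_v)


# Compare against JAX's own reverse-mode gradient of the naive attention.
keys = jax.random.split(jax.random.PRNGKey(0), 4)
q = jax.random.normal(keys[0], (8, 4))
k = jax.random.normal(keys[1], (12, 4))
v = jax.random.normal(keys[2], (12, 4))
do = jax.random.normal(keys[3], (8, 4))

dq, dk, dv = tiled_attention_bwd(q, k, v, do)
dq_ref, dk_ref, dv_ref = jax.vjp(attention, q, k, v)[1](do)
```

The largest temporary in this sketch has shape (block, N_K) or (N_Q, block), so peak memory grows linearly with sequence length at the cost of recomputing the scores in every pass.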

Download files

Source Distribution

flash_hog-0.4.0.tar.gz (19.3 kB view details)

Uploaded Source

Built Distribution

flash_hog-0.4.0-py3-none-any.whl (26.6 kB view details)

Uploaded Python 3

File details

Details for the file flash_hog-0.4.0.tar.gz.

File metadata

  • Download URL: flash_hog-0.4.0.tar.gz
  • Upload date:
  • Size: 19.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.6

File hashes

Hashes for flash_hog-0.4.0.tar.gz
Algorithm Hash digest
SHA256 5a2797d914f454948d4fa1c20286ccb9c3959eb80c5d1d03a6809f3cc0f50883
MD5 33dc2f85acbc3331a95052fe01ce5f31
BLAKE2b-256 49cf15f3e731adaa16d8626db1426ab2865c9aa915b9017d7a7d43acee36bd9e

File details

Details for the file flash_hog-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: flash_hog-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 26.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.6

File hashes

Hashes for flash_hog-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 6dbab2281ad5273e93f4c3366c1c6fd2e25671fc087d9d37e9d76a98ad5727e9
MD5 f7161f7d7de82fc8a6dc579f00ea07d3
BLAKE2b-256 71041283b6eeadd1632c79c6d3988492799f20c1405445cee7847e1cf682ec36
