Kernels for higher order gradients of Flash Attention.

Flash Hog

This repo contains the code for Flash Higher-Order-Gradients, a.k.a. Flash Hog. The kernel achieves roughly a 3.7x speedup over an XLA-optimized kernel, and its memory use scales linearly rather than quadratically.

Hog Speedup

Installation

TODO

Method

Flash Hog makes four recomputation passes so that it needs no atomics and never stores an intermediate tensor of shape (N_Q, N_K). The first three passes tile thread-wise across Q: one computes dd, one computes b, and one computes both dQ' and ddO. A final pass tiles over K and produces dK' and dV'. The equations we implement are the following:

Equations
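
For orientation, here is a minimal JAX sketch of the quantity a higher-order-gradient attention kernel computes, written the naive way. It is not the Flash Hog API or its implementation. The function names (`attention`, `attention_bwd`, `attention_bwd_bwd`), the cotangent names (`ddq`, `ddk`, `ddv`), and the reading of dQ', dK', dV' and ddO as the gradients of the backward pass with respect to Q, K, V and dO are all assumptions made for illustration. Unlike the passes described above, this reference materializes the full (N_Q, N_K) matrix, so its memory use is quadratic.

```python
import jax
import jax.numpy as jnp


def attention(q, k, v):
    """Naive softmax attention; materializes the (N_Q, N_K) score matrix."""
    scale = q.shape[-1] ** -0.5
    s = (q @ k.T) * scale
    p = jax.nn.softmax(s, axis=-1)
    return p @ v


def attention_bwd(q, k, v, do):
    """First-order backward pass: returns (dQ, dK, dV) for upstream gradient dO."""
    _, vjp = jax.vjp(attention, q, k, v)
    return vjp(do)


def attention_bwd_bwd(q, k, v, do, ddq, ddk, ddv):
    """Backward of the backward pass (assumed naming, see lead-in).

    Given cotangents (ddq, ddk, ddv) for (dQ, dK, dV), returns the gradients
    with respect to (q, k, v, do), here called (dQ', dK', dV', ddO).
    """
    _, vjp = jax.vjp(attention_bwd, q, k, v, do)
    return vjp((ddq, ddk, ddv))


# Small random problem; the sizes are arbitrary.
n_q, n_k, d = 8, 16, 4
keys = jax.random.split(jax.random.PRNGKey(0), 7)
q = jax.random.normal(keys[0], (n_q, d))
k = jax.random.normal(keys[1], (n_k, d))
v = jax.random.normal(keys[2], (n_k, d))
do = jax.random.normal(keys[3], (n_q, d))
ddq = jax.random.normal(keys[4], (n_q, d))
ddk = jax.random.normal(keys[5], (n_k, d))
ddv = jax.random.normal(keys[6], (n_k, d))

dq2, dk2, dv2, ddo = attention_bwd_bwd(q, k, v, do, ddq, ddk, ddv)
```

A reference like this is mainly useful for checking a fused kernel's outputs on small inputs. One sanity check that needs no kernel at all: the backward pass is linear in dO, so `ddo` should equal the forward-mode derivative `jax.jvp(attention, (q, k, v), (ddq, ddk, ddv))[1]` up to floating-point error.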
