Kernels for higher order gradients of Flash Attention.

Project description

Flash Hog

This repo contains the code for Flash Higher-Order-Gradients, a.k.a. Flash Hog. The kernel achieves roughly a 3.7x speedup over an XLA-optimized kernel, and its memory use scales linearly rather than quadratically.

[Figure: Hog Speedup]

Installation

TODO

Method

Flash Hog makes four recomputation passes, which avoids both atomics and saving any intermediate tensor of shape (N_Q, N_K). The first three passes are tiled thread-wise across Q: one computes dd, one computes b, and one computes both dQ' and ddO. The final pass is tiled over K and produces dK' and dV'. The equations we implement are the following:

[Figure: Equations]
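
The tiling idea can be sketched in plain Python. This is not the Flash Hog kernel and does not implement the equations above. As a stand-in it uses ordinary softmax attention, with the output O as a quantity indexed by Q and the first-order gradient dV = Pᵀ dO as a quantity indexed by K. The function names, the tile size of one row, and the omission of the usual 1/sqrt(d) scaling are all assumptions made for illustration. The point is that each output row is owned by exactly one worker, so no atomics are needed, and the score matrix is recomputed instead of being stored.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def q_tiled_pass(Q, K, V):
    """One worker per query row: stream over keys with O(d) state per row."""
    out, lse = [], []
    for q in Q:
        m = -math.inf                      # running max of the scores
        denom = 0.0                        # running softmax denominator
        acc = [0.0] * len(V[0])            # running weighted sum of V rows
        for k, v in zip(K, V):
            s = dot(q, k)
            m_new = max(m, s)
            scale = math.exp(m - m_new)    # rescale old state (0.0 on first key)
            p = math.exp(s - m_new)
            denom = denom * scale + p
            acc = [a * scale + p * x for a, x in zip(acc, v)]
            m = m_new
        out.append([a / denom for a in acc])
        lse.append(m + math.log(denom))    # per-row log-sum-exp, saved for reuse
    return out, lse


def k_tiled_pass(Q, K, dO, lse):
    """One worker per key row: recompute P[i][j] from lse, accumulate dV[j] locally."""
    dV = []
    for k in K:
        acc = [0.0] * len(dO[0])
        for q, g, l in zip(Q, dO, lse):
            p = math.exp(dot(q, k) - l)    # recomputed probability, never stored
            acc = [a + p * x for a, x in zip(acc, g)]
        dV.append(acc)
    return dV


def naive(Q, K, V, dO):
    """Reference that materialises the full (N_Q, N_K) probability matrix."""
    P = []
    for q in Q:
        s = [dot(q, k) for k in K]
        m = max(s)
        e = [math.exp(x - m) for x in s]
        z = sum(e)
        P.append([x / z for x in e])
    d = len(V[0])
    O = [[sum(P[i][j] * V[j][c] for j in range(len(K))) for c in range(d)]
         for i in range(len(Q))]
    dV = [[sum(P[i][j] * dO[i][c] for i in range(len(Q))) for c in range(d)]
          for j in range(len(K))]
    return O, dV


def max_abs_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))


def rand(n, d):
    return [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


random.seed(0)
N_Q, N_K, D = 5, 7, 4
Q, K, V, dO = rand(N_Q, D), rand(N_K, D), rand(N_K, D), rand(N_Q, D)

O_tiled, lse = q_tiled_pass(Q, K, V)
dV_tiled = k_tiled_pass(Q, K, dO, lse)
O_ref, dV_ref = naive(Q, K, V, dO)

max_err_O = max_abs_diff(O_tiled, O_ref)
max_err_dV = max_abs_diff(dV_tiled, dV_ref)
print("max |O - O_ref|  :", max_err_O)
print("max |dV - dV_ref|:", max_err_dV)
```

Only per-row statistics of size N_Q (here `lse`) cross from one pass to the next, which is what keeps memory linear in sequence length. The cost is that every pass recomputes the Q·K scores.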
