
Kernels for higher-order gradients of Flash Attention.


Flash Hog


This repo contains the code for Flash Higher-Order-Gradients, a.k.a. Flash Hog. The kernel achieves roughly a 3.7x speedup over an XLA-optimized kernel, with memory that scales linearly rather than quadratically.

[Figure: Hog Speedup]
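To make "higher-order gradients" concrete, here is a minimal JAX sketch of a second-order backward pass built from plain autodiff over a naive attention function. It is an illustration only, not the Flash Hog kernel and not necessarily the XLA baseline used in the benchmark. The function names (`attention`, `attention_bwd`, `attention_bwd_bwd`), the `dd_q`/`dd_k`/`dd_v` argument names, and the single-head, unmasked formulation are all assumptions made for this example.

```python
import jax
import jax.numpy as jnp


def attention(q, k, v):
    """Naive single-head attention; materializes the (N_Q, N_K) score matrix."""
    scores = q @ k.T / jnp.sqrt(q.shape[-1])
    return jax.nn.softmax(scores, axis=-1) @ v


def attention_bwd(q, k, v, d_o):
    """First-order backward: maps the output cotangent dO to (dQ, dK, dV)."""
    _, vjp = jax.vjp(attention, q, k, v)
    return vjp(d_o)


def attention_bwd_bwd(q, k, v, d_o, dd_q, dd_k, dd_v):
    """Second-order backward: differentiates attention_bwd itself.

    dd_q, dd_k, dd_v are cotangents for the first-order outputs (dQ, dK, dV).
    Returns cotangents with respect to (Q, K, V, dO), in that order.
    """
    _, vjp = jax.vjp(attention_bwd, q, k, v, d_o)
    return vjp((dd_q, dd_k, dd_v))
```

Every step of this reference keeps tensors of shape (N_Q, N_K) alive, which is the quadratic memory cost a fused kernel is meant to avoid.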

Installation

TODO

Method

Flash Hog makes 4 recomputation passes so that it needs no atomics and never saves an intermediate tensor of shape (N_Q, N_K). The first 3 passes are tiled thread-wise across Q: one computes dd, one computes b, and one computes both dQ' and ddO. A final pass is tiled over K and produces dK' and dV'. The equations we implement are the following:

[Figure: Equations]
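The tiling idea can be illustrated on the simpler first-order backward pass. The sketch below is not the Flash Hog kernel and does not compute dd, b, dQ', ddO, dK', or dV'. It only shows, under assumptions, why separate Q-tiled and K-tiled passes avoid atomics: each tile recomputes the probabilities it needs and writes only its own output rows. The function name `tiled_attention_bwd`, the `block` parameter, and the use of `jax.lax.map` in place of GPU thread blocks are all hypothetical choices for this example.

```python
import jax
import jax.numpy as jnp


def tiled_attention_bwd(q, k, v, d_o, block=4):
    """First-order attention backward via recomputation in tiled passes.

    Assumes `block` divides both N_Q and N_K. Each tile only ever holds a
    (block, N_K) or (N_Q, block) slab, never the full (N_Q, N_K) matrix.
    """
    d = q.shape[-1]
    scale = 1.0 / jnp.sqrt(d)
    q_blks = q.reshape(-1, block, d)
    do_blks = d_o.reshape(-1, block, d_o.shape[-1])

    # Q-tiled pass 1: per-query statistics of size N_Q (linear memory).
    def row_stats(args):
        q_blk, do_blk = args
        s = q_blk @ k.T * scale
        lse = jax.nn.logsumexp(s, axis=-1)
        o_blk = jnp.exp(s - lse[:, None]) @ v
        return lse, jnp.sum(do_blk * o_blk, axis=-1)

    lse_blks, delta_blks = jax.lax.map(row_stats, (q_blks, do_blks))

    # Q-tiled pass 2: each tile owns its rows of dQ, so no atomics are needed.
    def dq_tile(args):
        q_blk, do_blk, lse_blk, delta_blk = args
        p = jnp.exp(q_blk @ k.T * scale - lse_blk[:, None])
        ds = p * (do_blk @ v.T - delta_blk[:, None])
        return ds @ k * scale

    dq = jax.lax.map(dq_tile, (q_blks, do_blks, lse_blks, delta_blks))

    # K-tiled pass: each tile owns its rows of dK and dV.
    lse, delta = lse_blks.reshape(-1), delta_blks.reshape(-1)

    def dkv_tile(args):
        k_blk, v_blk = args
        p = jnp.exp(q @ k_blk.T * scale - lse[:, None])
        ds = p * (d_o @ v_blk.T - delta[:, None])
        return ds.T @ q * scale, p.T @ d_o

    k_blks = k.reshape(-1, block, d)
    v_blks = v.reshape(-1, block, v.shape[-1])
    dk, dv = jax.lax.map(dkv_tile, (k_blks, v_blks))
    return dq.reshape(q.shape), dk.reshape(k.shape), dv.reshape(v.shape)
```

The design trade-off is extra compute for less memory traffic: the probabilities are recomputed in every pass, but the only saved intermediates are per-query vectors of length N_Q.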

