Flash Attention CUTE (CUDA Template Engine) implementation

FlashAttention-4 (CuTeDSL)

FlashAttention-4 is a CuTeDSL-based implementation of FlashAttention for Hopper and Blackwell GPUs.

Installation

pip install flash-attn-4

Usage

from flash_attn.cute import flash_attn_func, flash_attn_varlen_func

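# q, k, v are assumed to be CUDA tensors shaped (batch, seqlen, nheads, headdim),
# as in earlier FlashAttention releases.
out = flash_attn_func(q, k, v, causal=True)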
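
To make the effect of causal=True concrete, here is a minimal pure-Python sketch of causal softmax attention for a single head. It is a reference for the math only, not the CuTeDSL kernel. The name causal_attention is hypothetical, and the sketch assumes equal query and key lengths and the conventional 1/sqrt(head_dim) scaling.

```python
import math


def causal_attention(q, k, v):
    """Reference causal attention for one head (illustrative only).

    q, k, v are lists of float vectors, one vector per token.
    Assumes len(q) == len(k) == len(v).
    """
    # Assumed default scaling: 1 / sqrt(head_dim).
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, qi in enumerate(q):
        # Causal mask: query i only attends to keys 0..i.
        scores = [
            scale * sum(a * b for a, b in zip(qi, k[j]))
            for j in range(i + 1)
        ]
        # Numerically stable softmax over the visible keys.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        # Weighted average of the visible value vectors.
        out.append([
            sum(w * v[j][d] for j, w in enumerate(weights)) / total
            for d in range(len(v[0]))
        ])
    return out
```

With this definition the first output row always equals the first value vector, because token 0 can only attend to itself. A result from the fused kernel on small inputs could be compared against such a reference within a floating-point tolerance.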

Development

git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
pip install -e "flash_attn/cute[dev]"
pytest tests/cute/

Download files

Source Distribution

flash_attn_4-4.0.0b3.tar.gz (243.7 kB)

Uploaded Source

Built Distribution

flash_attn_4-4.0.0b3-py3-none-any.whl (261.7 kB)

Uploaded Python 3

File details

Details for the file flash_attn_4-4.0.0b3.tar.gz.

File metadata

  • Download URL: flash_attn_4-4.0.0b3.tar.gz
  • Upload date:
  • Size: 243.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.11

File hashes

Hashes for flash_attn_4-4.0.0b3.tar.gz
Algorithm    Hash digest
SHA256       06b6cff3bc49afd48f5b7dfad7ab237c207333d4f66736e5b567da0937e5b0a5
MD5          5c9d9e2779eff8c8399a80a5177a4000
BLAKE2b-256  8a36aab028ba5843b3ba21d37434c2b8891e4f105ec586261960086903b34228
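
A downloaded file can be checked against a published digest with the standard library alone. The sketch below streams the file through hashlib and compares the result with the expected SHA256 value. The helper names sha256_of and matches_published_hash are hypothetical, and the file is assumed to be available locally.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, with the source distribution saved in the current directory, matches_published_hash("flash_attn_4-4.0.0b3.tar.gz", "06b6cff3bc49afd48f5b7dfad7ab237c207333d4f66736e5b567da0937e5b0a5") should return True for an intact download.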

File details

Details for the file flash_attn_4-4.0.0b3-py3-none-any.whl.

File metadata

  • Download URL: flash_attn_4-4.0.0b3-py3-none-any.whl
  • Upload date:
  • Size: 261.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.11

File hashes

Hashes for flash_attn_4-4.0.0b3-py3-none-any.whl
Algorithm    Hash digest
SHA256       86183c0759689324224fa624437f64518a3b5852e5b7c77aab0a6d8a12ace537
MD5          e2fcc7bfdf1ccebc92f2c1a1a270ee65
BLAKE2b-256  d68550b336261a4e7c801215e2a6e8ef9cc6f236e8ad7ed6eea31e8cb66c1804
