
An implementation of flash attention variants in Triton.

Project description

Flash Attention X

Flash Attention X is an implementation of flash attention variants in Triton, including flash_attention_full, flash_attention_causal, and flash_attention_bias, among others.
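A minimal usage sketch is shown below. The function names come from the description above, but the import path and exact signatures (half-precision CUDA tensors of shape (batch, heads, seq_len, head_dim)) are assumptions, not documented API:

import torch
from flash_attention_x import flash_attention_causal  # assumed import path

# Assumed input layout: (batch, heads, seq_len, head_dim) on a CUDA device.
batch, heads, seq_len, head_dim = 2, 8, 1024, 64
q = torch.randn(batch, heads, seq_len, head_dim, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = flash_attention_causal(q, k, v)  # assumed signature; output shape matches q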

Installation

You can install Flash Attention X from PyPI using pip:

pip install flash_attention_x

or install it in editable mode from a local clone of the repository:

pip install -e .

Requirements

  • Python >= 3.8
  • Triton >= 2.3.0

The package is tested with Triton 2.3.0 and CUDA 12.0.

Features

  • Efficient implementations of flash attention variants, including flash_attention_full, flash_attention_causal, and flash_attention_bias, among others (see the reference sketch after this list)
  • Built using Triton for optimized performance
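
For reference, the variants listed above correspond to the following unfused attention computation. This plain PyTorch sketch only illustrates the math; it is not the package's Triton kernel, and the argument names are placeholders:

import math
import torch

def reference_attention(q, k, v, bias=None, causal=False):
    # q, k, v: (batch, heads, seq_len, head_dim)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if bias is not None:
        # flash_attention_bias: an additive bias on the attention scores
        scores = scores + bias
    if causal:
        # flash_attention_causal: mask out attention to future positions
        seq_len = scores.size(-1)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool, device=scores.device), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
    # flash_attention_full: plain softmax attention with no mask or bias
    return torch.softmax(scores, dim=-1) @ v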

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Xiaotian Han

Download files


Source Distribution

flash_attention_x-0.0.1.tar.gz (6.7 kB, source)

File details

Details for the file flash_attention_x-0.0.1.tar.gz.

File metadata

  • Download URL: flash_attention_x-0.0.1.tar.gz
  • Upload date:
  • Size: 6.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.12

File hashes

Hashes for flash_attention_x-0.0.1.tar.gz

  • SHA256: e83277a2bd3d7023d22cb5c8b6dd159d4bd99cabfce60dde9426f091b01f242a
  • MD5: df1f9ca384e86a8fcd8b6618162f10a8
  • BLAKE2b-256: c69e38f907af63cc1ca3ee67f6cbea8a40d44b62efbf6e0cb7a4c3c530e41874

These hashes can be used to verify the integrity of the downloaded file.
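
For instance, the SHA256 digest can be checked with the Python standard library; the local file name below is assumed to be the downloaded source distribution:

import hashlib

path = "flash_attention_x-0.0.1.tar.gz"  # assumed local path to the downloaded sdist
expected = "e83277a2bd3d7023d22cb5c8b6dd159d4bd99cabfce60dde9426f091b01f242a"

with open(path, "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

print("OK" if digest == expected else "MISMATCH")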
