NATTEN

Neighborhood Attention Extension

These details have not been verified by PyPI

Project links

Project description

Fast Multi-dimensional Sparse Attention

Visit natten.org to view our documentation, install instructions, and pre-built wheels.

NATTEN is an open-source project dedicated to providing infrastructure for multi-dimensional sparse attention methods, specifically Neighborhood Attention (NA), a sliding window self-attention mechanism, and its extensions (dilated NA, causal NA, strided NA). We provide Fused Multi-Headed Attention (FMHA) and Fused Neighborhood Attention (FNA) training and inference kernels, for all NVIDIA architectures since Maxwell (SM50). We also ship Hopper (SM90) and Blackwell (SM100, SM103) native kernels, offering speedups proportional to reduction in FLOPs over cuDNN and Flash Attention 3.

Neighborhood Attention introduces locality and sparsity into self attention in a manner similar to convolution. This means for any self attention problem, you will be able to specify a kernel_size, stride, and dilation. Because it's attention, you can also toggle causal masking.

NATTEN is dedicated to multi-dimensional layouts of tokens (i.e. 2-D and 3-D feature maps). Users have the freedom to explore the massive parameter space that NATTEN offers, in which the attention span in any dimension/axis of your input can be controlled with its respective kernel_size, stride, dilation, and is_causal parameters.


`kernel_size=(6,6)`	`kernel_size=(6,6)`
	`dilation=(2,2)`


`kernel_size=(6,6)`	`kernel_size=(6,6)`
`is_causal=(True,True)`	`stride=(2,2)`

Getting started

NATTEN supports PyTorch >= 2.7, and Python >= 3.9 (everything PyTorch supports). Please refer to install instructions for details on how to install NATTEN.

Release `0.21.6`

NATTEN has undergone major changes since the last release (0.17.5), so we strongly recommend reading our new updated documentation in this webpage before upgrading.

Our latest release ships our Hopper FNA and Blackwell FNA kernels, bringing you massive speedups on modern data center class NVIDIA GPUs such as the H100 and B200. It also speeds up inference in our existing Ampere FNA kernels up to 1.47X in fully block-sparse cases, provides much cleaner error reporting, ships with our profiling toolkit, and so much more!

License

NATTEN is released under the MIT License.

Citation

If you found NATTEN, or neighborhood attention useful in your work, consider citing the appropriate papers:

Original neighborhood attention paper

First work proposing neighborhood attention, and introducing NATTEN.

@inproceedings{hassani2023neighborhood,
  title        = {Neighborhood Attention Transformer},
  author       = {Ali Hassani and Steven Walton and Jiachen Li and Shen Li and Humphrey Shi},
  year         = 2023,
  booktitle    = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}
}

Dilated neighborhood attention

Introduced dilation for introducing sparse global context.

@article{hassani2022dilated,
  title        = {Dilated Neighborhood Attention Transformer},
  author       = {Ali Hassani and Humphrey Shi},
  year         = 2022,
  journal      = {arXiv preprint arXiv:2209.15001}
}

GEMM-based and fused neighborhood attention

Introduced the first multi-dimensional attention kernels: GEMM-based and fused neighborhood attention (FNA).

Introduced causal neighborhood attention, and extended implementation to support varying parameters across different dimensions.

@inproceedings{hassani2024faster,
  title        = {Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level},
  author       = {Ali Hassani and Wen-Mei Hwu and Humphrey Shi},
  year         = 2024,
  booktitle    = {Advances in Neural Information Processing Systems},
}

Generalized neighborhood attention: towards speed-of-light performance

Introduced even-sized windows, strided neighborhood attention, block-sparse forms of neighborhood attention, NATTEN Simulator, and our new Hopper and Blackwell FNA kernels, implemented with out-of-kernel token permutation.

@article{hassani2025generalized,
  title        = {Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light},
  author       = {Hassani, Ali and Zhou, Fengzhe and Kane, Aditya and Huang, Jiannan and Chen, Chieh-Yun and Shi, Min and Walton, Steven and Hoehnerbach, Markus and Thakkar, Vijay and Isaev, Michael and others},
  year         = 2025,
  journal      = {arXiv preprint arXiv:2504.16922}
}

Acknowledgements

See CONTRIBUTORS.md.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.21.6

Apr 14, 2026

0.21.5

Feb 8, 2026

0.21.1

Oct 26, 2025

0.21.0

Jul 14, 2025

0.20.1

Jun 14, 2025

0.20.0

Jun 7, 2025

0.17.5

Mar 17, 2025

0.17.4

Jan 29, 2025

0.17.3

Nov 1, 2024

0.17.1

May 19, 2024

0.17.0

May 2, 2024

0.15.1

Jan 25, 2024

0.15.0

Jan 9, 2024

0.14.6

Mar 21, 2023

0.14.5

Mar 16, 2023

0.14.4

Nov 1, 2022

0.14.2

Oct 16, 2022

0.14.1 yanked

Oct 7, 2022

Reason this release was yanked:

Deprecated

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

natten-0.21.6.tar.gz (2.9 MB view details)

Uploaded Apr 14, 2026 Source

File details

Details for the file natten-0.21.6.tar.gz.

File metadata

Download URL: natten-0.21.6.tar.gz
Upload date: Apr 14, 2026
Size: 2.9 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for natten-0.21.6.tar.gz
Algorithm	Hash digest
SHA256	`c26871225fa533f8a9433ccd4496e3c4dcf6d0e0da369a2f6b547d4321a25dc1`
MD5	`b5751e0506b7651bc9563f8d48996275`
BLAKE2b-256	`062fe10aa0edfa2e203fc649a4a52ed9e946decc16824f4740cf61d85a7f9a88`

See more details on using hashes here.

NATTEN 0.21.6

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

Getting started

Release `0.21.6`

License

Citation

Original neighborhood attention paper

Dilated neighborhood attention

GEMM-based and fused neighborhood attention

Generalized neighborhood attention: towards speed-of-light performance

Acknowledgements

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

File details

File metadata

File hashes

NATTEN 0.21.6

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

Getting started

Release 0.21.6

License

Citation

Original neighborhood attention paper

Dilated neighborhood attention

GEMM-based and fused neighborhood attention

Generalized neighborhood attention: towards speed-of-light performance

Acknowledgements

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

File details

File metadata

File hashes

Release `0.21.6`