cuDNN FrontEnd Python library

cuDNN FrontEnd (FE)

cuDNN FE is the modern, open-source entry point to the NVIDIA cuDNN library and to high-performance open-source kernels. It provides a C++ header-only library and a Python interface for accessing the cuDNN Graph API and those kernels.

🚀 Embracing Open Source

We are open-sourcing kernels based on customer needs, with the goal of educating developers and enabling them to customize the kernels as needed.

We are now shipping OSS kernels, allowing you to inspect, modify, and contribute to the core logic. Check out our latest implementations:

  • GEMM + Amax: Optimized FP8 matrix multiplication with absolute maximum calculation.
  • GEMM + SwiGLU: High-performance implementation of the SwiGLU activation fused with GEMM.
  • Grouped GEMM + SwiGLU: SwiGLU activation fused with Grouped GEMM.
  • Grouped GEMM + dSwiGLU: dSwiGLU activation fused with Grouped GEMM.
  • NSA: Native Sparse Attention, as described in the paper "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention".

Key Features

  • Unified Graph API: Create reusable, persistent cudnn_frontend::graph::Graph objects to describe complex subgraphs.
  • Ease of Use: Simplified C++ and Python bindings (via pybind11) that abstract away the boilerplate of the backend API.
  • Performance: Built-in autotuning and support for the latest NVIDIA GPU architectures.

Installation

🐍 Python

The easiest way to get started is via pip:

pip install nvidia_cudnn_frontend

Requirements:

  • Python 3.8+ (the prebuilt 1.20.0 wheels target CPython 3.9 through 3.14)
  • NVIDIA driver and CUDA Toolkit
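
A quick way to confirm that the wheel landed in the active environment is to query the installed distribution's metadata. The sketch below uses only the standard library and does not import the package or touch the GPU. The helper name installed_version is hypothetical; the distribution name is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "nvidia_cudnn_frontend") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("nvidia_cudnn_frontend is not installed in this environment")
    else:
        print(f"nvidia_cudnn_frontend {version} is installed")
```

A None result usually means pip installed into a different interpreter or virtual environment than the one running the check.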

⚙️ C++ (Header Only)

Since the C++ API is header-only, integration is seamless. Simply include the header in your compilation unit:

#include <cudnn_frontend.h>

Ensure your include path points to the include/ directory of this repository.

Building from Source

If you want to build the Python bindings from source or run the C++ samples:

1. Dependencies

  • Python development headers, python-dev (e.g., apt-get install python-dev; on recent Debian/Ubuntu releases the package is named python3-dev)
  • Dependencies listed in requirements.txt (pip install -r requirements.txt)

2. Python Source Build

pip install -v git+https://github.com/NVIDIA/cudnn-frontend.git

Environment variables CUDAToolkit_ROOT and CUDNN_PATH can be used to override default paths.
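
When the source build is driven from a script, one option is to pass the overrides through the environment of the pip subprocess. The following is a minimal sketch under stated assumptions: the helper names source_build_env and install_from_source and the example paths are hypothetical; only the two variable names and the pip command come from the text above.

```python
import os
import subprocess
import sys
from typing import Dict, Mapping, Optional

REPO_URL = "git+https://github.com/NVIDIA/cudnn-frontend.git"


def source_build_env(
    cuda_root: Optional[str] = None,
    cudnn_path: Optional[str] = None,
    base_env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Copy the environment and add only the overrides that were supplied."""
    env = dict(os.environ if base_env is None else base_env)
    if cuda_root is not None:
        env["CUDAToolkit_ROOT"] = cuda_root
    if cudnn_path is not None:
        env["CUDNN_PATH"] = cudnn_path
    return env


def install_from_source(env: Mapping[str, str]) -> None:
    """Run the source build with the given environment (needs network access)."""
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "-v", REPO_URL],
        env=dict(env),
        check=True,
    )


if __name__ == "__main__":
    # Placeholder paths: replace with the real install locations.
    env = source_build_env(cuda_root="/path/to/cuda", cudnn_path="/path/to/cudnn")
    print({k: env[k] for k in ("CUDAToolkit_ROOT", "CUDNN_PATH")})
```

Building the environment as a copy keeps the overrides scoped to the one pip invocation instead of changing the parent process.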

3. C++ Samples Build

mkdir build && cd build
cmake -DCUDNN_PATH=/path/to/cudnn -DCUDAToolkit_ROOT=/path/to/cuda ../
cmake --build . -j16
./bin/samples

Documentation & Examples

  • Developer Guide: Official NVIDIA Documentation
  • C++ Samples: See samples/cpp for comprehensive usage examples.
  • Python Samples: See samples/python for pythonic implementations.

🤝 Contributing

We welcome contributions! Whether you are fixing a bug, improving documentation, or optimizing one of our new OSS kernels, your help makes cuDNN better for everyone.

  1. Check the Contribution Guide for details.
  2. Fork the repo and create your branch.
  3. Submit a Pull Request.

Debugging

To view the execution flow and debug issues, you can enable logging via environment variables:

# Log to stdout
export CUDNN_FRONTEND_LOG_INFO=1
export CUDNN_FRONTEND_LOG_FILE=stdout

# Log to a file
export CUDNN_FRONTEND_LOG_INFO=1
export CUDNN_FRONTEND_LOG_FILE=execution_log.txt

Logging Levels:

  • CUDNN_FRONTEND_LOG_INFO=0: No logging
  • CUDNN_FRONTEND_LOG_INFO=1: Full logging with tensor dumps
  • CUDNN_FRONTEND_LOG_INFO=10: Basic logging (safe for CUDA graph capture)

Alternatively, you can control logging programmatically via cudnn_frontend::isLoggingEnabled().
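
From a Python process, the same environment variables can be set in code. The sketch below assumes the library reads them when it initializes, so the call should happen before the frontend is imported; that ordering is an assumption, not something this document states. The helper name configure_frontend_logging is hypothetical, and the accepted levels are the three listed above.

```python
import os
from typing import Dict, MutableMapping, Optional

# Levels listed above: 0 = off, 1 = full logging with tensor dumps, 10 = basic logging.
VALID_LEVELS = (0, 1, 10)


def configure_frontend_logging(
    level: int,
    destination: str = "stdout",
    env: Optional[MutableMapping[str, str]] = None,
) -> Dict[str, str]:
    """Set the two logging variables and return what was written."""
    if level not in VALID_LEVELS:
        raise ValueError(f"level must be one of {VALID_LEVELS}, got {level}")
    target = os.environ if env is None else env
    settings = {
        "CUDNN_FRONTEND_LOG_INFO": str(level),
        "CUDNN_FRONTEND_LOG_FILE": destination,
    }
    target.update(settings)
    return settings


if __name__ == "__main__":
    # Basic logging to a file; call this before importing the frontend.
    print(configure_frontend_logging(10, "execution_log.txt"))
```

Rejecting values outside the documented set makes a typo such as level 2 fail loudly instead of silently selecting an undocumented mode.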

License

This project is licensed under the MIT License.

Built Distributions

Release 1.20.0 ships prebuilt wheels only; no source distribution is available for this release. Wheels are provided for CPython 3.9, 3.10, 3.11, 3.12, 3.13, and 3.14 on three platforms:

  • Windows x86-64 (win_amd64): 1.9 MB
  • Linux x86-64 (manylinux_2_27_x86_64.manylinux_2_28_x86_64, glibc 2.27+): 2.5 MB
  • Linux ARM64 (manylinux_2_27_aarch64.manylinux_2_28_aarch64, glibc 2.27+): 2.4 MB

Wheel file names follow the pattern nvidia_cudnn_frontend-1.20.0-cpXY-cpXY-<platform>.whl, for example nvidia_cudnn_frontend-1.20.0-cp314-cp314-win_amd64.whl.
