NVIDIA CUTLASS Python DSL

Project description

CUTLASS 4.x provides Python-native interfaces for writing high-performance CUDA kernels based on core CUTLASS and CuTe concepts, without any performance compromises. This allows for a much smoother learning curve, orders-of-magnitude faster compile times, native integration with DL frameworks without writing glue code, and much more intuitive metaprogramming that does not require deep C++ expertise.

Overall, we envision CUTLASS DSLs as a family of domain-specific languages (DSLs). With the release of 4.0, we are releasing the first of these, CuTe DSL. This is a low-level programming model that is fully consistent with CuTe C++ abstractions, exposing core concepts such as layouts, tensors, hardware atoms, and full control over the hardware thread and data hierarchy.

CuTe DSL demonstrates optimal matrix multiply and other linear algebra operations targeting the programmable, high-throughput Tensor Cores implemented by NVIDIA's Ampere, Hopper, and Blackwell architectures.

We believe it will become an indispensable tool for students, researchers, and performance engineers alike — flattening the learning curve of GPU programming, rapidly prototyping kernel designs, and bringing optimized solutions into production.

CuTe DSL is currently in public beta and will graduate out of beta by the end of summer 2025.

For more details, please visit the CUTLASS documentation or the CUTLASS GitHub repository.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp313-cp313-manylinux_2_28_x86_64.whl (78.5 MB)

Uploaded CPython 3.13, manylinux: glibc 2.28+ x86-64

nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp313-cp313-manylinux_2_28_aarch64.whl (78.5 MB)

Uploaded CPython 3.13, manylinux: glibc 2.28+ ARM64

nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp312-cp312-manylinux_2_28_x86_64.whl (78.5 MB)

Uploaded CPython 3.12, manylinux: glibc 2.28+ x86-64

nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp312-cp312-manylinux_2_28_aarch64.whl (78.5 MB)

Uploaded CPython 3.12, manylinux: glibc 2.28+ ARM64

nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp311-cp311-manylinux_2_28_x86_64.whl (78.5 MB)

Uploaded CPython 3.11, manylinux: glibc 2.28+ x86-64

nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp311-cp311-manylinux_2_28_aarch64.whl (78.5 MB)

Uploaded CPython 3.11, manylinux: glibc 2.28+ ARM64

nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp310-cp310-manylinux_2_28_x86_64.whl (78.5 MB)

Uploaded CPython 3.10, manylinux: glibc 2.28+ x86-64

nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp310-cp310-manylinux_2_28_aarch64.whl (78.5 MB)

Uploaded CPython 3.10, manylinux: glibc 2.28+ ARM64
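
The wheel file names listed above follow the standard pattern of distribution name, version, Python tag, ABI tag, and platform tag, separated by dashes. As a minimal sketch for picking the right file, the tags can be split out with the standard library alone. The helper name `parse_wheel_name` is hypothetical, and the sketch assumes no optional build tag is present, which holds for the files in this release:

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel file name into its five mandatory components.

    Hypothetical helper for illustration; assumes no optional build tag.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"expected 5 dash-separated fields, got {len(parts)}")
    return WheelTags(*parts)


tags = parse_wheel_name(
    "nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp313-cp313-manylinux_2_28_x86_64.whl"
)
print(tags.python_tag, tags.platform_tag)  # prints: cp313 manylinux_2_28_x86_64
```

Here `cp313` indicates CPython 3.13 and `manylinux_2_28_x86_64` indicates an x86-64 Linux system with glibc 2.28 or newer, matching the "Uploaded" line under each file.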

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp313-cp313-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 962bfdc784ea27f6051d2064adf5dad8fe12ec1c197858d0345f457e17a3f4a0
MD5 b8d1bb447f36ef2f9eb64f3ca2deb27b
BLAKE2b-256 dae4eeb211eff8e4379245a2d4a707513159aae01caa7fe48d1adac5d19008d9

See more details on using hashes here.
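
One way to use the published digests is to hash a downloaded wheel locally and compare the result with the SHA256 value listed for that file. The following is a minimal sketch using only `hashlib` from the standard library; the function names `sha256_of_file` and `matches_published_hash` are hypothetical and not part of any CUTLASS or PyPI tooling:

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so a large wheel is never fully loaded into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_sha256: str) -> bool:
    """Compare a local file's SHA256 with a digest copied from this page."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

For an intact download of the first file listed above, calling `matches_published_hash` with the path to the wheel and the digest `962bfdc784ea27f6051d2064adf5dad8fe12ec1c197858d0345f457e17a3f4a0` should return `True`. A mismatch suggests a corrupted or altered file.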

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp313-cp313-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp313-cp313-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 a382e03349be92f815d9827cbe3b77e42758cc9df783c28b9f914662ca379992
MD5 cad5bd40ee3a2dc6860f601c66fcdf3b
BLAKE2b-256 b388db80d7c5bb0bd95f670740f6a32bda201e14bdce80adefb9dffd9f8bb0ba

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp312-cp312-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 af819669fd96b2b94728eb96cdb10fab93f68f7296eb522e6447520bfe6f078e
MD5 751c604727efb7a178ab1141565bd3d7
BLAKE2b-256 a220b35af920f6b897fad398173996957dd280cf7f93c2a6d0297394c9f713b0

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp312-cp312-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp312-cp312-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 8311423849eb77dbed2eecba44fb0ed8317bc8a70daa0ed321cb958d81c8a4f0
MD5 1e33bf5b63e2752a0d57b92510ec715c
BLAKE2b-256 144c49a81cb22b9c68951d491a0f6db1215e80c7688f602b138e6d2f13ff9f57

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp311-cp311-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 cadb3d1c5b39b5e24bf5cba609879bacaf79acb4e3304f34d7434acee9d37e08
MD5 ad3d370bce2d6b74d407ab34fd0bccbd
BLAKE2b-256 d08b16e8ebbe271523f30b3d1f00a57635cf4e0002b0b28a5f35f8990c670b2a

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp311-cp311-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp311-cp311-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 a4a37bdb60105136917d278fce6314da1574a32bf650b6c8ca6e395f1e0ab919
MD5 ece092f3361bbe9f9ac9a17eea3e51c2
BLAKE2b-256 c9579759b30a5e3ac2691133efc6a4e5c7cfa887cf3c945d2188b3b815cc4730

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp310-cp310-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 9c145fdcdf627fa62f918680ced0f320cf1a2e4614f912e42bd1ac4612209581
MD5 845c24b7b9174b57fcd7a098a814aca5
BLAKE2b-256 0f48a60cdf326fa92e104bc068d36933e33b789a395b1b2327e556fb13eee93a

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp310-cp310-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0.dev1-cp310-cp310-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 5641566720ee79f79d38852ef60ca76e3d9146d035fa88b901a687b54845c29a
MD5 119f0fbd335ecb064b8c2d1fe14275e0
BLAKE2b-256 21cb0d02b3a029e08db8291ea61e5c590f7f846beb36a7d1436cc162c6f7f096
