NVIDIA CUTLASS Python DSL

Project description

CUTLASS 4.x provides Python-native interfaces for writing high-performance CUDA kernels based on core CUTLASS and CuTe concepts, without any performance compromises. This allows for a much smoother learning curve, orders-of-magnitude faster compile times, native integration with DL frameworks without writing glue code, and much more intuitive metaprogramming that does not require deep C++ expertise.

Overall, we envision CUTLASS DSLs as a family of domain-specific languages (DSLs). With the release of 4.0, we are releasing the first of these, CuTe DSL. This is a low-level programming model that is fully consistent with CuTe C++ abstractions, exposing core concepts such as layouts, tensors, hardware atoms, and full control over the hardware thread and data hierarchy.
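To make the "layout" concept mentioned above more concrete, here is a minimal sketch in plain Python. It does not use the CuTe DSL API; the function name `layout_index` is hypothetical and exists only for illustration. It assumes the basic idea that a layout pairs a shape with strides and maps a logical coordinate to a linear offset by a dot product of coordinate and stride.

```python
# Conceptual sketch only: plain Python, NOT the CuTe DSL API.
# `layout_index` is a hypothetical helper name.

def layout_index(coord, stride):
    """Map a logical coordinate to a linear offset: sum(coord[i] * stride[i])."""
    return sum(c * s for c, s in zip(coord, stride))

shape = (4, 2)

# Column-major strides for a 4x2 shape: moving down a column steps by 1,
# moving across a row steps by 4.
col_major_stride = (1, 4)
col_major_offsets = [
    [layout_index((i, j), col_major_stride) for j in range(shape[1])]
    for i in range(shape[0])
]

# Row-major strides for the same shape: moving across a row steps by 1,
# moving down a column steps by 2.
row_major_stride = (2, 1)
row_major_offsets = [
    [layout_index((i, j), row_major_stride) for j in range(shape[1])]
    for i in range(shape[0])
]

print(col_major_offsets)
print(row_major_offsets)
```

The same shape with different strides gives a different mapping from coordinates to memory offsets, which is the kind of control over data arrangement the description refers to.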

CuTe DSL demonstrates optimal matrix multiply and other linear algebra operations targeting the programmable, high-throughput Tensor Cores implemented by NVIDIA's Ampere, Hopper, and Blackwell architectures.

We believe it will become an indispensable tool for students, researchers, and performance engineers alike — flattening the learning curve of GPU programming, rapidly prototyping kernel designs, and bringing optimized solutions into production.

CuTe DSL is currently in public beta and will graduate out of beta by end of summer 2025.

For more details please visit CUTLASS Documentation or CUTLASS Github.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

nvidia_cutlass_dsl_libs_cu13-4.4.1-cp313-cp313-manylinux_2_28_x86_64.whl (78.4 MB)

Uploaded: CPython 3.13, manylinux (glibc 2.28+), x86-64

nvidia_cutlass_dsl_libs_cu13-4.4.1-cp313-cp313-manylinux_2_28_aarch64.whl (78.7 MB)

Uploaded: CPython 3.13, manylinux (glibc 2.28+), ARM64

nvidia_cutlass_dsl_libs_cu13-4.4.1-cp312-cp312-manylinux_2_28_x86_64.whl (78.4 MB)

Uploaded: CPython 3.12, manylinux (glibc 2.28+), x86-64

nvidia_cutlass_dsl_libs_cu13-4.4.1-cp312-cp312-manylinux_2_28_aarch64.whl (78.7 MB)

Uploaded: CPython 3.12, manylinux (glibc 2.28+), ARM64

nvidia_cutlass_dsl_libs_cu13-4.4.1-cp311-cp311-manylinux_2_28_x86_64.whl (78.4 MB)

Uploaded: CPython 3.11, manylinux (glibc 2.28+), x86-64

nvidia_cutlass_dsl_libs_cu13-4.4.1-cp311-cp311-manylinux_2_28_aarch64.whl (78.7 MB)

Uploaded: CPython 3.11, manylinux (glibc 2.28+), ARM64

nvidia_cutlass_dsl_libs_cu13-4.4.1-cp310-cp310-manylinux_2_28_x86_64.whl (78.4 MB)

Uploaded: CPython 3.10, manylinux (glibc 2.28+), x86-64

nvidia_cutlass_dsl_libs_cu13-4.4.1-cp310-cp310-manylinux_2_28_aarch64.whl (78.7 MB)

Uploaded: CPython 3.10, manylinux (glibc 2.28+), ARM64
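The wheel file names listed above follow the standard wheel naming convention, `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`. As a rough sketch of how to read them, the snippet below splits a file name into those fields. The helper name `parse_wheel_filename` is hypothetical, and the code is a simplified illustration rather than a full validator.

```python
# Minimal sketch of splitting a wheel file name into its tags.
# `parse_wheel_filename` is a hypothetical helper, not part of any library.

def parse_wheel_filename(filename):
    """Split a wheel file name into its named components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        distribution, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return {
        "distribution": distribution,
        "version": version,
        "build_tag": build_tag,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }

info = parse_wheel_filename(
    "nvidia_cutlass_dsl_libs_cu13-4.4.1-cp313-cp313-manylinux_2_28_x86_64.whl"
)
print(info)
```

For the file above, the Python tag `cp313` corresponds to CPython 3.13 and the platform tag `manylinux_2_28_x86_64` to glibc 2.28+ on x86-64, matching the "Uploaded" lines in the listing.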

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.1-cp313-cp313-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.1-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 054c4f6fff51b22a33f0cf1c65e48ed2015c58d42b2a17d2749a4bad3b9b744c
MD5 a74f20280201e52dce3ba7fb2bb3fa27
BLAKE2b-256 b3474905cac297d5b8e6d657ff0a23c22dd3d13647e2581e8b266a542d6aa93a

See more details on using hashes here.
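One way to use the published digests is to check a downloaded wheel against them before installing. The sketch below computes a file's SHA256 with the standard library and compares it to an expected hex string. The helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path is an assumption about where the wheel was saved.

```python
import hashlib

# Hypothetical helpers for illustration; only hashlib from the standard library is used.

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 against a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()

# Example usage (assumes the wheel was downloaded to the current directory):
# ok = matches_published_hash(
#     "nvidia_cutlass_dsl_libs_cu13-4.4.1-cp313-cp313-manylinux_2_28_x86_64.whl",
#     "054c4f6fff51b22a33f0cf1c65e48ed2015c58d42b2a17d2749a4bad3b9b744c",
# )
```

pip can also enforce this check itself through its hash-checking mode, where each requirement carries a `--hash=sha256:...` option and installation is run with `--require-hashes`.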

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.1-cp313-cp313-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.1-cp313-cp313-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 7c3605162408caa605937046149d73610d3355283b1fa3a9559540b2c393db39
MD5 9cb14fb90eb829367619625baede6dc0
BLAKE2b-256 b45f995e2ac5bc2cac43c6f27ec746554f184064e46932cc5318606cc7685b22

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.1-cp312-cp312-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.1-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 39feed15c293960afe7beca4d1009d091fbd2d762d921cc0018a7e63e501cac7
MD5 0c08449b2df773a774d7c56f5a60b30d
BLAKE2b-256 8136a8ea89547d3a7a73e1bde28010e744b10f4c866f43532b286b1e62ac50d5

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.1-cp312-cp312-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.1-cp312-cp312-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 09f5327711c740211e26eae3c34be260cf8d5e305204a87c634cb535117e57e8
MD5 cb031331ab4bc8b9459f92e36377c1b5
BLAKE2b-256 ac472e913ff28c76cbce1fc90d18bedd3a8904f9820c6aa0fb3641e8812e2dc0

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.1-cp311-cp311-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.1-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 54f82a9e92a5274078f42e1e9f1d22bcb73804f442a026581fad82b679a6fe40
MD5 d4b9d1432612e541fa878ee64cd9ce74
BLAKE2b-256 528c3a64f59d282f1d8077c5c5dc8691bf33673a66ba4abec78ffbb3224c1472

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.1-cp311-cp311-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.1-cp311-cp311-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 23b27545762feb8e3353d1ee4436290a9ae582645083c92d42a17ba30032e626
MD5 cba54ec68a4c86ac9d4e35a47983441b
BLAKE2b-256 177eeb39e3254c0fea886d9861b2df67a4bec527b717ec52ca69d2f70b6ae7e2

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.1-cp310-cp310-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.1-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 6aa6bf20ab9dfe0ffb386fae0ead3c949108023e2b0b04c3a3df1ac772dce94f
MD5 c5f31fbe5c6eb84966f97c585582ebda
BLAKE2b-256 b1abbf3ab90c487177431f802b12d309d8b832bd7629191ba14f4b4029c17c42

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.1-cp310-cp310-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.1-cp310-cp310-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 081c73316bde78a29e721786f4855094f814d4f81e401d81ec92ef509960e3cd
MD5 defdf88e4e0a4e96265c59189db12f26
BLAKE2b-256 56cde2dc622283b47f9aef12417785a2ca22f3984be9c635005537ca9817f217
