NVIDIA CUTLASS Python DSL

Project description

CUTLASS 4.x provides Python-native interfaces for writing high-performance CUDA kernels based on core CUTLASS and CuTe concepts, without any performance compromises. This allows for a much smoother learning curve, orders-of-magnitude faster compile times, native integration with DL frameworks without writing glue code, and much more intuitive metaprogramming that does not require deep C++ expertise.

Overall, we envision CUTLASS DSLs as a family of domain-specific languages (DSLs). The 4.0 release introduces the first of these, CuTe DSL. This is a low-level programming model that is fully consistent with CuTe C++ abstractions, exposing core concepts such as layouts, tensors, hardware atoms, and full control over the hardware thread and data hierarchy.
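
As a rough illustration of the layout concept mentioned above (a mapping from a logical coordinate to a linear memory offset, described by a shape and a stride), here is a plain-Python sketch. It deliberately does not use the CuTe DSL API: `layout_offset` is a hypothetical helper written for this example, and it handles only flat (non-nested) shapes.

```python
from typing import Sequence


def layout_offset(coord: Sequence[int], shape: Sequence[int], stride: Sequence[int]) -> int:
    """Map a logical coordinate to a linear offset for a flat (shape, stride) layout.

    Hypothetical helper for illustration only; not part of CuTe DSL.
    """
    if not (len(coord) == len(shape) == len(stride)):
        raise ValueError("coord, shape, and stride must have the same rank")
    for c, extent in zip(coord, shape):
        if not 0 <= c < extent:
            raise IndexError(f"coordinate {c} is out of range for extent {extent}")
    # The offset is the inner product of the coordinate with the strides.
    return sum(c * s for c, s in zip(coord, stride))


shape = (4, 8)
column_major = (1, 4)  # the first mode is contiguous in memory
row_major = (8, 1)     # the second mode is contiguous in memory

print(layout_offset((2, 3), shape, column_major))  # 2*1 + 3*4 = 14
print(layout_offset((2, 3), shape, row_major))     # 2*8 + 3*1 = 19
```

The same logical coordinate lands at a different offset depending on the stride, which is why a layout can describe both row-major and column-major data without changing the indexing code.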

CuTe DSL demonstrates optimal matrix multiply and other linear algebra operations targeting the programmable, high-throughput Tensor Cores implemented by NVIDIA's Ampere, Hopper, and Blackwell architectures.

We believe it will become an indispensable tool for students, researchers, and performance engineers alike — flattening the learning curve of GPU programming, rapidly prototyping kernel designs, and bringing optimized solutions into production.

CuTe DSL is currently in public beta and will graduate out of beta by the end of summer 2025.

For more details, please visit the CUTLASS documentation or the CUTLASS GitHub repository.

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp313-cp313-manylinux_2_28_x86_64.whl (74.6 MB)

Uploaded CPython 3.13 manylinux: glibc 2.28+ x86-64

nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp313-cp313-manylinux_2_28_aarch64.whl (75.5 MB)

Uploaded CPython 3.13 manylinux: glibc 2.28+ ARM64

nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp312-cp312-manylinux_2_28_x86_64.whl (74.6 MB)

Uploaded CPython 3.12 manylinux: glibc 2.28+ x86-64

nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp312-cp312-manylinux_2_28_aarch64.whl (75.5 MB)

Uploaded CPython 3.12 manylinux: glibc 2.28+ ARM64

nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp311-cp311-manylinux_2_28_x86_64.whl (74.6 MB)

Uploaded CPython 3.11 manylinux: glibc 2.28+ x86-64

nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp311-cp311-manylinux_2_28_aarch64.whl (75.5 MB)

Uploaded CPython 3.11 manylinux: glibc 2.28+ ARM64

nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp310-cp310-manylinux_2_28_x86_64.whl (74.6 MB)

Uploaded CPython 3.10 manylinux: glibc 2.28+ x86-64

nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp310-cp310-manylinux_2_28_aarch64.whl (75.5 MB)

Uploaded CPython 3.10 manylinux: glibc 2.28+ ARM64
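
The wheel file names above follow the standard `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl` pattern. The sketch below splits one of them into its parts to show which tag selects the interpreter and which selects the platform. `WheelName` and `parse_wheel_filename` are hypothetical names chosen for this example; for real tooling, a maintained parser such as the one in the `packaging` library is likely a better choice.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel file name into its dash-separated components (illustrative helper)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the python tag.
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return WheelName(distribution, version, build, python_tag, abi_tag, platform_tag)


name = parse_wheel_filename(
    "nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp313-cp313-manylinux_2_28_x86_64.whl"
)
print(name.version)       # 4.4.0.dev1
print(name.python_tag)    # cp313
print(name.platform_tag)  # manylinux_2_28_x86_64
```

Here `cp313` corresponds to CPython 3.13, and `manylinux_2_28_x86_64` corresponds to the "glibc 2.28+ x86-64" label in the listing.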

File details

Details for the file nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp313-cp313-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 196fde5a2411aa9b497625977b71cf1f17e9c80a98f56ace9313f901f10bf5de
MD5 a43d939467d5cc2b6c056f447b7c8aed
BLAKE2b-256 348cb9d22253223f33de35baec6e76e786f1dc2055b3a03a3ee62af8a34d99e4

See more details on using hashes here.
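
The digests listed for each file can be checked against a downloaded wheel before installing it. The following is a minimal sketch using only the Python standard library; `sha256_of_file` and `matches_published_digest` are hypothetical helper names, and the local file path is assumed to be wherever the wheel was saved.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Union


def sha256_of_file(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Union[str, Path], expected_sha256: str) -> bool:
    """Compare a file's SHA256 digest with the published value (case-insensitive)."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

For example, after downloading the CPython 3.13 x86-64 wheel, calling `matches_published_digest` with the local path and the SHA256 value listed for that file should return `True` if the download is intact. pip can perform the same check at install time when hashes are pinned in a requirements file and `--require-hashes` is used.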

File details

Details for the file nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp313-cp313-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp313-cp313-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 34eb0329587ac61fd3b9b2378ab9c1280bde69d740336ecf4fcb53249e371c62
MD5 9f03910442826455d8b1a704874bd68f
BLAKE2b-256 bc15d778704dc46bed1c3c2a01b28d079653b856c4c1628d7dc16ed5b3cdc6c9

File details

Details for the file nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp312-cp312-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 492a8d53127ad3008c939cbfbdf809ebb164272bb6ccac4dbe20461cad41df6f
MD5 ebcb520a0134787eb29befbc1ab24f17
BLAKE2b-256 0bd6e9a11c14827cf36e339dcc6903e715c03307739c0c1e44f270143f9d23e5

File details

Details for the file nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp312-cp312-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp312-cp312-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 bea93ac1064067cb85a26584cf486f35e929082beaacf24fb7201f420017e055
MD5 d872529e73f6b257060df29209fea198
BLAKE2b-256 07c23882dff5398f57ed8eea1cac898b7484a4482503622edd76e69a9c723170

File details

Details for the file nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp311-cp311-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 0da11c56935ac9034ecaed5282b43dcdebb22d96f08f3328e8472343442f95cb
MD5 a50e8aa0939b7eddb2f242a504fe2722
BLAKE2b-256 5606fddc523dc6ffea1f76614383998744c9c1adab1a7ee17b0ebb49d1d63bdf

File details

Details for the file nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp311-cp311-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp311-cp311-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 8556f9397e9e3290c9f46cb4bee30b77a1af2bfc4347fa8f5a6777152310017d
MD5 1fd977ac66148cb1b5ca420882254456
BLAKE2b-256 e343af92db2167ee4eb30620641bf14ec9aebfae8ff27f1cfc5a1f2552a4d855

File details

Details for the file nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp310-cp310-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 149c82bc0e2c90e50df9e347ab363b3bec4895090a5a2c11c84d550880e83fe2
MD5 41dc01788b11dc45110a80f2ddd639dd
BLAKE2b-256 defab57ad248255106bd64d3f1a6253382e6b045318b07b85769192a090e98e7

File details

Details for the file nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp310-cp310-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_base-4.4.0.dev1-cp310-cp310-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 5fe61f152f23509f4b2bf4e527af5ca328f6d6609039d7f485c2f3427b9dd4c4
MD5 0dc758efbecc02a31cb791adf9c25b42
BLAKE2b-256 57ddbdaf649d36a98df77285335af9164bdc7c09e1858e5c53f680586e643a70
