NVIDIA CUTLASS Python DSL

Project description

CUTLASS 4.x provides Python-native interfaces for writing high-performance CUDA kernels based on core CUTLASS and CuTe concepts without any performance compromises. This allows for a much smoother learning curve, orders-of-magnitude faster compile times, native integration with DL frameworks without writing glue code, and much more intuitive metaprogramming that does not require deep C++ expertise.

Overall, we envision CUTLASS DSLs as a family of domain-specific languages (DSLs). The 4.0 release introduces the first of these, CuTe DSL. This is a low-level programming model that is fully consistent with CuTe C++ abstractions: it exposes core concepts such as layouts, tensors, hardware atoms, and full control over the hardware thread and data hierarchy.

CuTe DSL demonstrates optimal matrix multiply and other linear algebra operations targeting the programmable, high-throughput Tensor Cores implemented by NVIDIA's Ampere, Hopper, and Blackwell architectures.

We believe it will become an indispensable tool for students, researchers, and performance engineers alike — flattening the learning curve of GPU programming, rapidly prototyping kernel designs, and bringing optimized solutions into production.

CuTe DSL is currently in public beta and will graduate out of beta by end of summer 2025.

For more details, please visit the CUTLASS documentation or the CUTLASS GitHub repository.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

nvidia_cutlass_dsl_libs_cu13-4.4.0-cp313-cp313-manylinux_2_28_x86_64.whl (78.4 MB)

CPython 3.13, manylinux: glibc 2.28+ x86-64

nvidia_cutlass_dsl_libs_cu13-4.4.0-cp313-cp313-manylinux_2_28_aarch64.whl (78.7 MB)

CPython 3.13, manylinux: glibc 2.28+ ARM64

nvidia_cutlass_dsl_libs_cu13-4.4.0-cp312-cp312-manylinux_2_28_x86_64.whl (78.4 MB)

CPython 3.12, manylinux: glibc 2.28+ x86-64

nvidia_cutlass_dsl_libs_cu13-4.4.0-cp312-cp312-manylinux_2_28_aarch64.whl (78.7 MB)

CPython 3.12, manylinux: glibc 2.28+ ARM64

nvidia_cutlass_dsl_libs_cu13-4.4.0-cp311-cp311-manylinux_2_28_x86_64.whl (78.4 MB)

CPython 3.11, manylinux: glibc 2.28+ x86-64

nvidia_cutlass_dsl_libs_cu13-4.4.0-cp311-cp311-manylinux_2_28_aarch64.whl (78.7 MB)

CPython 3.11, manylinux: glibc 2.28+ ARM64

nvidia_cutlass_dsl_libs_cu13-4.4.0-cp310-cp310-manylinux_2_28_x86_64.whl (78.4 MB)

CPython 3.10, manylinux: glibc 2.28+ x86-64

nvidia_cutlass_dsl_libs_cu13-4.4.0-cp310-cp310-manylinux_2_28_aarch64.whl (78.7 MB)

CPython 3.10, manylinux: glibc 2.28+ ARM64
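
The file names listed above follow the standard wheel naming convention, `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`. As a minimal sketch of how to read them, the snippet below splits a file name into those fields. The `WheelTags` type and the `parse_wheel_filename` helper are hypothetical names introduced here for illustration; they are not part of this package or of pip.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    """Fields of a wheel file name (hypothetical helper type)."""
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its dash-separated fields.

    Illustrative only: an optional build tag is accepted and dropped,
    and no further validation of the individual fields is attempted.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return WheelTags(*parts)


tags = parse_wheel_filename(
    "nvidia_cutlass_dsl_libs_cu13-4.4.0-cp313-cp313-manylinux_2_28_x86_64.whl"
)
print(tags.python_tag, tags.platform_tag)
```

Here `cp313` identifies CPython 3.13, and `manylinux_2_28_x86_64` indicates a Linux build that expects glibc 2.28 or newer on x86-64, which matches the summary shown under each file.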

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0-cp313-cp313-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 4f61ec9c05c54789fe04373df4d252cbc8c8d57ccae8c6bf28c0e3e39e38266c
MD5 782f1c103cf368dc1100f2e8fe22d03a
BLAKE2b-256 8f91402985beafc2e44be8734c28fa4c5717484f7aac841caeaba8b3497eb62a

See more details on using hashes here.

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0-cp313-cp313-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0-cp313-cp313-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 5fb99a02c93e08c2d4ffb5d7f752f4ec6d250cb2f1017ba0da9edbed3f542003
MD5 67821b374688e8b1e816520d18d266f8
BLAKE2b-256 83cbd34229c1ae00b3960aa2bea0387a0599ec63d8df5c0f998a1dfb4483bac0

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0-cp312-cp312-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 97e6bdb0f2eaa51ef80af32d4de1ee70709e8a9798627e6554e3258fee6247e5
MD5 1ae3200630f7a825a303c9f80607e4bc
BLAKE2b-256 1c86c32ad83dd4e3b0ea34a0be3f6298b78653e0e519b2a3b39c246089538222

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0-cp312-cp312-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0-cp312-cp312-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 9f1cec16ddb78cdcc9298bab55bfe4b123f91f1d8a82d28ed2d5257f03665fa0
MD5 056b98a44c7191277888ee3610817c49
BLAKE2b-256 d0d74215b1bf0b89a3bb6f73d5e8ae49cfb5c8f0cbd8c1cc1d46ffe92f82e737

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0-cp311-cp311-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 60189f941a2337ae60e3acc0ec382fc218886f4902becc9c7106d2e2153e479f
MD5 52a6a695a8b004ff7603507b084c1717
BLAKE2b-256 f931bf2afe639797870d2a0809e9ef82c5989c21128e926fcabbd7056179afad

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0-cp311-cp311-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0-cp311-cp311-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 7f8142edfdeda4425f41166aba92f8927784323932d0c9478b6acaaa515ee9fa
MD5 296002dcd6c23e204a22de6a1f0f5f80
BLAKE2b-256 69bc4540915db852ec10ab9c3d69144f631b792477640ad2d65c44ed5aa80aba

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0-cp310-cp310-manylinux_2_28_x86_64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 bf98e0edc4c577e7d26597d5df330bb89343f39f1f90d50e42f98527c273b97c
MD5 d2757b41b402df4ca32fa1817354fd28
BLAKE2b-256 2a7980de8423abe24d57058d8e88ca486252040c9ee59352d4451691642d11b1

File details

Details for the file nvidia_cutlass_dsl_libs_cu13-4.4.0-cp310-cp310-manylinux_2_28_aarch64.whl.

File hashes

Hashes for nvidia_cutlass_dsl_libs_cu13-4.4.0-cp310-cp310-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 b33c5dde2c71a8b73e7d860ffe2a981dc840c82c87f201cd3b0d82e3f5e0c843
MD5 e0e7416f4f85b816b13a953c470e1762
BLAKE2b-256 c16ad643adeb6f14943bfbe02304505cc017459e0f8cd4a326b41ec25de6d9a1
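
The SHA256 digests listed above can be used to check that a downloaded wheel was not corrupted or altered in transit. The following is a minimal sketch using only the Python standard library; `sha256_of_file` and `matches_published_hash` are hypothetical helper names introduced for illustration, and the local path in the usage note is an assumption about where the file was saved.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large wheels are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, assuming the CPython 3.13 x86-64 wheel was saved to the current directory, `matches_published_hash("nvidia_cutlass_dsl_libs_cu13-4.4.0-cp313-cp313-manylinux_2_28_x86_64.whl", "4f61ec9c05c54789fe04373df4d252cbc8c8d57ccae8c6bf28c0e3e39e38266c")` should return `True` for an intact download.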
