cumm
CUda Matrix Multiply library.
cumm was developed while learning CUTLASS, which relies so heavily on C++ templates that the code becomes hard to maintain. So I developed pccm, which uses Python as the metaprogramming language in place of C++ template metaprogramming.
pccm has since become the foundational framework of cumm and of my other C++ projects, such as spconv.
cumm also contains a Python asyncio-based GEMM simulator that shares the same metaprogram as the CUDA code, which enables GEMM visualization and an easier debugging experience.
Install
Prebuilt
We offer prebuilt binaries for Python 3.7-3.10 and CUDA 10.2/11.1/11.3/11.4 on Linux (manylinux) and Windows 10/11.
We will offer prebuilts for the CUDA versions supported by the latest PyTorch release. For example, PyTorch 1.9 supports CUDA 10.2 and 11.1, so we support them too.
- CUDA 10.2: pip install cumm-cu102
- CUDA 11.1: pip install cumm-cu111
- CUDA 11.3: pip install cumm-cu113
- CUDA 11.4: pip install cumm-cu114
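The package name depends only on the CUDA version, so the choice can be automated. The following is a minimal sketch, not part of cumm: the helper name `cumm_package_for_cuda` and the `PREBUILT_PACKAGES` table are hypothetical, and the table contains only the four packages listed above.

```python
# Hypothetical helper (not part of cumm): map a CUDA version string to the
# prebuilt package name listed in this README.
PREBUILT_PACKAGES = {
    "10.2": "cumm-cu102",
    "11.1": "cumm-cu111",
    "11.3": "cumm-cu113",
    "11.4": "cumm-cu114",
}


def cumm_package_for_cuda(cuda_version: str) -> str:
    """Return the pip package name for a CUDA version such as "11.3"."""
    # Keep only major.minor so that "11.3.1" resolves like "11.3".
    major_minor = ".".join(cuda_version.strip().split(".")[:2])
    try:
        return PREBUILT_PACKAGES[major_minor]
    except KeyError:
        raise ValueError(
            f"no prebuilt cumm package listed for CUDA {cuda_version}; "
            "build from source instead"
        ) from None


print(cumm_package_for_cuda("11.3"))  # cumm-cu113
```

A CUDA version outside the table raises `ValueError`, which is the cue to follow the source build steps instead.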
Build from source
Linux
- Install build-essential and CUDA.
- Run export CUMM_DISABLE_JIT="1"
- Run one of the following:
  - python setup.py install
  - pip install -e .
  - python setup.py bdist_wheel, then pip install dist/xxx.whl
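For scripted builds, the same steps can be driven from Python. This is a sketch under stated assumptions: the function names `build_env`, `wheel_command`, and `build_wheel` are hypothetical and not part of cumm, and the only facts taken from the steps above are the `CUMM_DISABLE_JIT` variable and the `setup.py bdist_wheel` command.

```python
import os
import subprocess
import sys
from typing import Dict, List, Optional


def build_env(base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy an environment and set CUMM_DISABLE_JIT="1", as the steps require."""
    env = dict(os.environ if base is None else base)
    env["CUMM_DISABLE_JIT"] = "1"
    return env


def wheel_command() -> List[str]:
    """Command for the bdist_wheel variant of the build step."""
    return [sys.executable, "setup.py", "bdist_wheel"]


def build_wheel(source_dir: str) -> None:
    """Run the wheel build inside a cumm source checkout (source_dir)."""
    # check=True raises CalledProcessError if the build fails.
    subprocess.run(wheel_command(), cwd=source_dir, env=build_env(), check=True)
```

Calling `build_wheel` with the path of a cumm checkout runs the build with the variable set only for that subprocess, so the calling shell's environment is left unchanged. The resulting wheel still has to be installed with pip afterwards.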
Windows 10/11
- Install Visual Studio 2019 or newer and make sure the C++ development package is installed. Install CUDA.
- Set the PowerShell script execution policy.
- Start a new PowerShell session and run tools/msvc_setup.ps1
- Run $Env:CUMM_DISABLE_JIT = "1"
- Run one of the following:
  - python setup.py install
  - pip install -e .
  - python setup.py bdist_wheel, then pip install dist/xxx.whl
Note
This work was done while the author was an employee at TuSimple.
Built Distributions
Hashes for cumm_cu114-0.2.2-cp310-cp310-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 7e9e148127281f53bc20d603a0b1956b3b21b0f4a776e88d4d5caed2acf5dfbf
MD5 | a658ec19441f8ba879dcb8a1d3406b98
BLAKE2b-256 | 1f83f798c3c9edadd0580ddb6954ab3a18e33124c3120027782a93a47f138153
Hashes for cumm_cu114-0.2.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 5fdf6c2dbd4e8274ac350f74a62e157311b013bb947e6bbd6cd87814857bdf4f
MD5 | 456ea7c07b94f51d0add97fdd75913fc
BLAKE2b-256 | d38b3acac29708c6cbd9e464884708e68673d1fca3f997325c2e7e63e09fd256
Hashes for cumm_cu114-0.2.2-cp39-cp39-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | e506bc5e2e1055eefd15f077c7b66dc8d91059678947e3ce3586ffbc6d833a41
MD5 | cc9c04485a923cd8c886bbc7f60e2443
BLAKE2b-256 | f29bca547fbaa38e892e5e37cf0c0d900ff19de465e55bdba6012b83e4c9b4fa
Hashes for cumm_cu114-0.2.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 0c8665c01b2d4992e7851db377a3245f6f3ffbb5b2f6a80c901d0b60c7966c43
MD5 | d6756e0d21639fe99487a7ab7e134e63
BLAKE2b-256 | aaab9a310e2e27fd056ed474dae8717389c2932a90c0525f989ced9e18568f98
Hashes for cumm_cu114-0.2.2-cp38-cp38-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 22381bd01a806e480733c96d5f43334acdc75caf6f3c5e531fcfdbcc1b2b5664
MD5 | d5189359d7da5431d9f06b242ebea0d1
BLAKE2b-256 | 779acbb88d8e04766f4a0e8b95ac46290c3eea16e7533a684c1db3ec987c4ab3
Hashes for cumm_cu114-0.2.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | aa78b3567f19284fd523f99356931145c55b1862bafb230d20b31dee84960f0a
MD5 | 60ea9d889be907bbf0230d7a2427efd5
BLAKE2b-256 | 14c70da735933ad03a3e013bd9c6a2555a8f923ca102fd559fdc9ea5d57195fb
Hashes for cumm_cu114-0.2.2-cp37-cp37m-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 012ade66ee72742e7d4c3c593c19687cbb76e4daf9b444f20870b46101220d26
MD5 | fa1b5d71b43da4204beff5572a319a78
BLAKE2b-256 | 03fa1c24c1db268e6798976ec3a190050c1cca2e2dd8acdfa5ffe99c10fdb454
Hashes for cumm_cu114-0.2.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | e9c4561d900b8998fdff33433f9dc931ed7e37786cdd6aaca60fd91cf9ac0704
MD5 | 28bf22438b5c840bf0ef870489f23b71
BLAKE2b-256 | 3cfd8e196c2345272b82cf4fcfa655b2af9d6b207cafc4a22be2319af0ecf8a5
Hashes for cumm_cu114-0.2.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 044778111b97ed59cad38873446dcad15249664fd670bb35c008b98861090f6f
MD5 | a5eaad533d753d72c29da159b09b30e2
BLAKE2b-256 | a09976784c1f98857ecc039619c8b24833af83374a3b65bde87bb7f7662efda2
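A downloaded wheel can be checked against the SHA256 digests listed above before installing it. The following is a minimal sketch using only the Python standard library; the function names `sha256_of_file` and `matches_published_digest` are hypothetical and not part of cumm.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large wheels are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Pass the path of the downloaded wheel and the SHA256 value from the table for the same file name. A result of `False` means the file differs from the published one and should not be installed.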