cumm
CUda Matrix Multiply library.
cumm was developed while I was learning CUTLASS, which relies so heavily on C++ templates that the code becomes hard to maintain. So I developed pccm, which uses Python as the meta-programming language, to replace C++ template meta-programming. pccm has since become the foundational framework of cumm and of my other C++ projects, such as spconv. cumm also contains a Python asyncio-based GEMM simulator that shares the same meta-program with the CUDA code, which enables GEMM visualization and an easier debugging experience.
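To illustrate the general idea of using Python instead of C++ templates for meta-programming, here is a minimal sketch that generates C++ source with tile sizes computed in Python. It does not use the pccm API; `TileShape`, `emit_tile_struct`, and the generated member names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileShape:
    """Hypothetical tile configuration; not a cumm or pccm type."""
    m: int
    n: int
    k: int


def emit_tile_struct(name: str, tile: TileShape) -> str:
    """Return C++ source for a struct with the tile sizes baked in as constants."""
    lines = [
        f"struct {name} {{",
        f"  static constexpr int kM = {tile.m};",
        f"  static constexpr int kN = {tile.n};",
        f"  static constexpr int kK = {tile.k};",
        # Derived values are computed in Python, not by the C++ compiler.
        f"  static constexpr int kElementsA = {tile.m * tile.k};",
        f"  static constexpr int kElementsB = {tile.k * tile.n};",
        "};",
    ]
    return "\n".join(lines)


source = emit_tile_struct("Tile128x64x32", TileShape(128, 64, 32))
print(source)
```

Because the derived constants are ordinary Python expressions, they can be inspected, printed, or unit-tested before any C++ is compiled, which is the kind of benefit the paragraph above attributes to moving meta-programming into Python.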
Install
Prebuilt
We offer prebuilt binaries for Python 3.7-3.10 and CUDA 10.2/11.1/11.3/11.4 on Linux (manylinux) and Windows 10/11.
We will offer prebuilt binaries for the CUDA versions supported by the latest PyTorch release. For example, PyTorch 1.9 supports CUDA 10.2 and 11.1, so we support them too.
pip install cumm-cu102 for CUDA 10.2
pip install cumm-cu111 for CUDA 11.1
pip install cumm-cu113 for CUDA 11.3
pip install cumm-cu114 for CUDA 11.4
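The package names above follow a simple pattern: `cumm-cu` followed by the CUDA major and minor version digits. As a small sketch, the helper below maps a CUDA version string to the matching package name. `cumm_package_for_cuda` and `SUPPORTED_CUDA` are hypothetical names for this example and are not part of cumm; the supported versions are the ones listed above.

```python
# CUDA versions with prebuilt packages, as listed in this README.
SUPPORTED_CUDA = ("10.2", "11.1", "11.3", "11.4")


def cumm_package_for_cuda(cuda_version: str) -> str:
    """Map a CUDA version such as "11.3" or "11.3.1" to a pip package name."""
    parts = cuda_version.strip().split(".")
    if len(parts) < 2:
        raise ValueError(f"expected MAJOR.MINOR, got {cuda_version!r}")
    major, minor = parts[0], parts[1]
    if f"{major}.{minor}" not in SUPPORTED_CUDA:
        raise ValueError(f"no prebuilt package listed for CUDA {major}.{minor}")
    return f"cumm-cu{major}{minor}"


print(cumm_package_for_cuda("11.3"))
```

The result can be passed directly to `pip install`. Versions outside the list raise `ValueError` instead of guessing a package name that may not exist.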
Build from source
Linux
- install build-essential and CUDA
- run export CUMM_DISABLE_JIT="1"
- run one of the following: python setup.py install, or pip install -e ., or python setup.py bdist_wheel followed by pip install dists/xxx.whl
Windows 10/11
- install Visual Studio 2019 or newer, make sure the C++ development package is installed, and install CUDA
- set the PowerShell script execution policy
- start a new PowerShell session and run tools/msvc_setup.ps1
- run $Env:CUMM_DISABLE_JIT = "1"
- run one of the following: python setup.py install, or pip install -e ., or python setup.py bdist_wheel followed by pip install dists/xxx.whl
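On either platform, the last two steps amount to setting `CUMM_DISABLE_JIT` to `"1"` and then running one build command in the source tree. A rough Python sketch of that sequence is shown below. The names `build_env`, `build_command`, and `run_build` are hypothetical helpers for this example, not part of cumm, and the sketch assumes the toolchain steps (compiler, CUDA, and on Windows tools/msvc_setup.ps1) have already been completed in the current shell.

```python
import os
import subprocess
import sys

# Build commands named in the steps above, keyed by a label chosen for this sketch.
BUILD_COMMANDS = {
    "install": [sys.executable, "setup.py", "install"],
    "editable": [sys.executable, "-m", "pip", "install", "-e", "."],
    "wheel": [sys.executable, "setup.py", "bdist_wheel"],
}


def build_env(base=None):
    """Return a copy of the environment with CUMM_DISABLE_JIT set to "1"."""
    env = dict(os.environ if base is None else base)
    env["CUMM_DISABLE_JIT"] = "1"
    return env


def build_command(mode):
    """Return the argument list for one of the build modes."""
    if mode not in BUILD_COMMANDS:
        raise ValueError(f"unknown build mode: {mode!r}")
    return list(BUILD_COMMANDS[mode])


def run_build(mode, source_dir="."):
    """Run the chosen build command inside source_dir (not called automatically)."""
    subprocess.run(build_command(mode), cwd=source_dir, env=build_env(), check=True)


# Example: run_build("editable", "path/to/cumm")
# After run_build("wheel", ...), install the produced wheel file with pip.
```

Passing the environment explicitly to the subprocess avoids depending on shell-specific syntax such as `export` or `$Env:`.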
Note
This work was done while the author was an employee at TuSimple.