CUda Matrix Multiply library
Project description
cumm
CUda Matrix Multiply library.
cumm was developed while I was learning CUTLASS, which relies so heavily on C++ templates that its code becomes hard to maintain. So I developed pccm, which uses Python as a metaprogramming language to replace C++ template metaprogramming. pccm has since become a foundational framework for cumm and for my other C++ projects, such as spconv. cumm also contains a Python asyncio-based GEMM simulator that shares the same meta program as the CUDA code, which enables GEMM visualization and an easier debugging experience.
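To make the idea of "Python as a metaprogramming language" concrete, here is a minimal sketch that emits a C++ function specialized for fixed matrix sizes, the kind of specialization that would otherwise need `template <int M, int N, int K>`. It uses plain Python string formatting only and is not the pccm API; the function name `generate_gemm_source` and the generated naming scheme are hypothetical.

```python
# Illustrative sketch only: plain string templating, NOT the pccm API.
def generate_gemm_source(m: int, n: int, k: int, dtype: str = "float") -> str:
    """Return C++ source for a naive GEMM with the sizes baked in as constants."""
    if min(m, n, k) <= 0:
        raise ValueError("matrix sizes must be positive")
    name = f"gemm_{m}x{n}x{k}"  # hypothetical naming scheme
    lines = [
        f"void {name}(const {dtype}* a, const {dtype}* b, {dtype}* c) {{",
        f"  for (int i = 0; i < {m}; ++i)",
        f"    for (int j = 0; j < {n}; ++j) {{",
        f"      {dtype} acc = 0;",
        f"      for (int p = 0; p < {k}; ++p)",
        f"        acc += a[i * {k} + p] * b[p * {n} + j];",
        f"      c[i * {n} + j] = acc;",
        "    }",
        "}",
    ]
    return "\n".join(lines)


# Generate the source for a 2x3x4 problem and show it.
source = generate_gemm_source(2, 3, 4)
print(source)
```

Because the generator is ordinary Python, the same parameters could in principle drive both code generation and a Python-side simulation, which is the property the paragraph above describes.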
Install
Prebuilt
We offer prebuilt binaries for Python 3.7-3.10 and CUDA 10.2/11.1/11.3/11.4 on Linux (manylinux).
We offer prebuilt binaries for Python 3.7-3.10 and CUDA 10.2/11.1/11.3/11.4 on Windows 10/11.
We will offer prebuilt binaries for the CUDA versions supported by the latest PyTorch release. For example, PyTorch 1.9 supports CUDA 10.2 and 11.1, so we support them too.
pip install cumm-cu102 for CUDA 10.2
pip install cumm-cu111 for CUDA 11.1
pip install cumm-cu113 for CUDA 11.3
pip install cumm-cu114 for CUDA 11.4
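The package names above follow a simple pattern: `cumm-cu` followed by the CUDA version with the dot removed. A small sketch of that mapping, limited to the four versions listed here, might look like the following. The helper name `cumm_package_for_cuda` is hypothetical and not part of cumm.

```python
# CUDA versions that have prebuilt packages according to the list above.
SUPPORTED_CUDA = ("10.2", "11.1", "11.3", "11.4")


def cumm_package_for_cuda(cuda_version: str) -> str:
    """Map a CUDA version string such as "11.3" to its pip package name."""
    if cuda_version not in SUPPORTED_CUDA:
        raise ValueError(
            f"no prebuilt package listed for CUDA {cuda_version}; "
            f"listed versions: {', '.join(SUPPORTED_CUDA)}"
        )
    return "cumm-cu" + cuda_version.replace(".", "")


# Print the install command for CUDA 11.3.
print("pip install " + cumm_package_for_cuda("11.3"))
```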
Build from source
Linux
- Install build-essential and CUDA.
- Run export CUMM_DISABLE_JIT="1"
- Run one of: python setup.py install, or pip install -e ., or python setup.py bdist_wheel followed by pip install dists/xxx.whl
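The Linux steps above can also be scripted. The sketch below sets CUMM_DISABLE_JIT to "1" in a copy of the environment and runs one of the build commands from the list; it assumes it is called from the root of a cumm source checkout with build-essential and CUDA already installed. The names `BUILD_COMMANDS`, `build_env`, and `run_build` are hypothetical helpers, not part of cumm.

```python
import os
import subprocess
import sys

# Build variants taken from the steps above, keyed by a hypothetical mode name.
BUILD_COMMANDS = {
    "install": [sys.executable, "setup.py", "install"],
    "editable": [sys.executable, "-m", "pip", "install", "-e", "."],
    "wheel": [sys.executable, "setup.py", "bdist_wheel"],
}


def build_env(base=None):
    """Return a copy of the environment with CUMM_DISABLE_JIT set to "1"."""
    env = dict(os.environ if base is None else base)
    env["CUMM_DISABLE_JIT"] = "1"
    return env


def run_build(mode="editable", cwd="."):
    """Run the selected build command; raises CalledProcessError on failure."""
    subprocess.run(BUILD_COMMANDS[mode], cwd=cwd, env=build_env(), check=True)
```

Calling `run_build("wheel")` only builds the wheel; installing the resulting file with pip remains a separate step, as in the list above.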
Windows 10/11
- Install Visual Studio 2019 or newer and make sure the C++ development package is installed. Install CUDA.
- Set the PowerShell script execution policy.
- Start a new PowerShell session and run tools/msvc_setup.ps1
- Run $Env:CUMM_DISABLE_JIT = "1"
- Run one of: python setup.py install, or pip install -e ., or python setup.py bdist_wheel followed by pip install dists/xxx.whl
Note
This work was done while the author was an employee at TuSimple.
Download files
Built Distributions
Hashes for cumm-0.2.3-cp310-cp310-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 35260634d7543196b8bcfcb20e81b470c848c9de25af5fa227f8c2f3ba29132d
MD5 | 5676609876d28268ddf3cd5cf6f43839
BLAKE2b-256 | 4f13291f213bf5ef198d281d53bed0a19189e28e964586755c11c33bedf4e3a4
Hashes for cumm-0.2.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 73129702581170df82db469da5ab06f024dc994f817c54ddb330c2b1d7249bbb
MD5 | f7fe9f8d0d8165e1c3ac25fa7786f9ea
BLAKE2b-256 | fbd05957b934561ed388b4bee683d48f11904a319238009d90562ffe68a67712
Hashes for cumm-0.2.3-cp39-cp39-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 4a8fd023eb43274078b82c964c85c0f751ecfd2bdecebf8ac3cee82d590c800a
MD5 | 50b6c877bbb934db7ad338c77ebb9240
BLAKE2b-256 | c3189c45767a1f56d9547b6f518cec6c1aaec600b0cd9c975f51e97accac0ea4
Hashes for cumm-0.2.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | c180d571e4da8113fdc8d0c899d23c0f478c29e4292850b53d744e4c2c1ef5ff
MD5 | 2bde432240bf7fa47dc38c24b0648e60
BLAKE2b-256 | df99a46265ce8d580c5d8f8a8bd26209630f4435551e40a932c67ed26383fcdb
Hashes for cumm-0.2.3-cp38-cp38-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | e21d035edccf6e2f6019946383f1fc9280f766b13333c527a5a08253fdeff7c3
MD5 | 5ead59f4c6f8807cd420a63553400812
BLAKE2b-256 | 77dc28cfdee9acc993872309b1f064ec38d0acc44d9a37ca1d9a69edc49fd03b
Hashes for cumm-0.2.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 65c413e8a1b4c94716c5b19b98e51b787a3d6aba366737b91d59695abe81dc57
MD5 | fdb5fe63fbb63133e72e3acb610252cb
BLAKE2b-256 | 663f52109113cd85e977af82eceb7692e0ca2e5b6525bc661ca7430a60a060c8
Hashes for cumm-0.2.3-cp37-cp37m-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | b21ce2e894edfe7ac1d244ea48dfdc5713f5cf2f1296dd0937ccfadcf22d5f41
MD5 | 939cb2631aa37df605904c3cf70ccd94
BLAKE2b-256 | 236675aa7d088d1930f9289a571026737c7b4e3704950264ab19f9e948df6a1b
Hashes for cumm-0.2.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 35eb91d3a9176d4854778505cb47b0937efd7d0aa73b4a22a4b7ab83e4b7cbab
MD5 | 6acd289ae0170753b7aba504f7081a25
BLAKE2b-256 | d2c4448e027dcaeec3472b1fb35a1f7b64f0cbf475691cc768c491e9609e2c7b
Hashes for cumm-0.2.3-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 2a75b5dee4f7d50d959c191d85c4e22a9e485c670427657f9ac417f11920b427
MD5 | 7ceed280c22b28f6fd1d2770eed452bc
BLAKE2b-256 | 83176e97d450ad971556a70f6a0e1f9dcecf3a35b2b2d95079b6c14a84fe5581