CUda Matrix Multiply library
cumm
cumm was developed while learning CUTLASS, which uses so many C++ templates that the code becomes hard to maintain. So I developed pccm, which uses Python as the metaprogramming language, to replace C++ template metaprogramming. pccm has now become a foundational framework of cumm and of my other C++ projects such as spconv.
cumm also contains a Python asyncio-based GEMM simulator that shares the same meta program with the CUDA code, enabling GEMM visualization and an easier debugging experience.
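To make the idea of using Python as the metaprogramming language more concrete, here is a minimal sketch in plain Python. It does not use pccm's actual API: `TileConfig`, `emit_kernel`, and `simulate` are hypothetical names invented for this illustration, and the kernel is a naive one, not what cumm generates. The point is only that one Python object of parameters can drive both the emitted CUDA source text and a pure-Python walk over the same tiles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileConfig:
    """Hypothetical parameters that C++ would take as template arguments."""
    tile_m: int
    tile_n: int
    dtype: str = "float"


def emit_kernel(cfg: TileConfig) -> str:
    """Return CUDA source text with the tile sizes baked in as literals."""
    name = f"gemm_{cfg.tile_m}x{cfg.tile_n}"
    return f"""
__global__ void {name}(const {cfg.dtype}* A, const {cfg.dtype}* B,
                       {cfg.dtype}* C, int M, int N, int K) {{
  // Each thread computes one element of a {cfg.tile_m}x{cfg.tile_n} block tile.
  int row = blockIdx.y * {cfg.tile_m} + threadIdx.y;
  int col = blockIdx.x * {cfg.tile_n} + threadIdx.x;
  if (row < M && col < N) {{
    {cfg.dtype} acc = 0;
    for (int k = 0; k < K; ++k) acc += A[row * K + k] * B[k * N + col];
    C[row * N + col] = acc;
  }}
}}
""".strip()


def simulate(cfg: TileConfig, a, b):
    """Pure-Python reference that visits the same tiles the kernel would."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for by in range(0, m, cfg.tile_m):        # plays the role of blockIdx.y
        for bx in range(0, n, cfg.tile_n):    # plays the role of blockIdx.x
            for row in range(by, min(by + cfg.tile_m, m)):
                for col in range(bx, min(bx + cfg.tile_n, n)):
                    c[row][col] = sum(a[row][i] * b[i][col] for i in range(k))
    return c
```

Changing `TileConfig(2, 2)` to another tile shape changes both the generated source and the simulated traversal, with no C++ templates involved.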
Install
Prebuilt
We offer prebuilt binaries for Python 3.7-3.10 and CUDA 10.2/11.1/11.3/11.4 on Linux (manylinux) and Windows 10/11.
We will offer prebuilt binaries for the CUDA versions supported by the latest PyTorch release. For example, PyTorch 1.9 supports CUDA 10.2 and 11.1, so we support them too.
- pip install cumm-cu102 for CUDA 10.2
- pip install cumm-cu111 for CUDA 11.1
- pip install cumm-cu113 for CUDA 11.3
- pip install cumm-cu114 for CUDA 11.4
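The naming scheme above (package suffix equals the CUDA major and minor version without the dot) can be sketched as a small lookup. The package names and CUDA versions come from the list above; the helper names `package_for_cuda` and `pip_command` are hypothetical and not part of cumm.

```python
import sys

# Prebuilt packages listed in this README, keyed by CUDA "major.minor".
PREBUILT = {
    "10.2": "cumm-cu102",
    "11.1": "cumm-cu111",
    "11.3": "cumm-cu113",
    "11.4": "cumm-cu114",
}


def package_for_cuda(version: str) -> str:
    """Map a CUDA version such as '11.3' or '10.2.89' to a package name."""
    key = ".".join(version.strip().split(".")[:2])  # keep major.minor only
    if key not in PREBUILT:
        raise ValueError(f"no prebuilt cumm package listed for CUDA {version}")
    return PREBUILT[key]


def pip_command(version: str) -> list:
    """Build the pip invocation for the current interpreter (not executed here)."""
    return [sys.executable, "-m", "pip", "install", package_for_cuda(version)]
```

For a CUDA version that is not listed, the helper raises an error instead of guessing a package name; in that case building from source is the remaining option.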
Build from source
Linux
- install build-essential and CUDA
- run export CUMM_DISABLE_JIT="1"
- run one of:
  python setup.py install
  pip install -e .
  python setup.py bdist_wheel, then pip install dist/xxx.whl
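The Linux steps above can also be scripted. The following is a minimal sketch, assuming it is run with the cumm source tree as `source_dir`; the function names are hypothetical. Only the `CUMM_DISABLE_JIT` variable and the `python setup.py bdist_wheel` command come from this README.

```python
import os
import subprocess
import sys


def build_env(base=None) -> dict:
    """Copy the environment and set CUMM_DISABLE_JIT=1, as the steps require."""
    env = dict(os.environ if base is None else base)
    env["CUMM_DISABLE_JIT"] = "1"
    return env


def wheel_command() -> list:
    """Command for the bdist_wheel variant of the build step."""
    return [sys.executable, "setup.py", "bdist_wheel"]


def build_wheel(source_dir: str) -> None:
    """Run the wheel build inside source_dir; raises if the build fails."""
    subprocess.run(wheel_command(), cwd=source_dir, env=build_env(), check=True)
```

Calling `build_wheel(".")` from the repository root would start the build. Passing the variable through `env` keeps it scoped to the build process instead of changing the current shell.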
Windows 10/11
- install Visual Studio 2019 or newer and make sure the C++ development package is installed, then install CUDA
- set the PowerShell script execution policy
- start a new PowerShell session and run tools/msvc_setup.ps1
- run $Env:CUMM_DISABLE_JIT = "1"
- run one of:
  python setup.py install
  pip install -e .
  python setup.py bdist_wheel, then pip install dist/xxx.whl
Note
This work was done while the author was an employee at TuSimple.