CUda Matrix Multiply library
cumm
CUda Matrix Multiply library.
cumm was developed while I was learning CUTLASS, which uses C++ templates so heavily that the code becomes unmaintainable. So I developed pccm, which uses Python as the metaprogramming language, to replace C++ template metaprogramming.
pccm has since become the foundational framework of cumm and of my other C++ projects, such as spconv.
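To make the idea of Python as a metaprogramming language concrete, here is a minimal sketch that generates specialized C++ source with ordinary string formatting. It does not use pccm's API; the function `make_dot_product` and the generated `dotN` functions are hypothetical names chosen only for illustration.

```python
def make_dot_product(n: int, dtype: str = "float") -> str:
    """Return C++ source for a dot product fully unrolled for length n.

    Hypothetical helper: plain string formatting, not pccm's API.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Python decides the unrolling at generation time, the job that
    # a recursive C++ template would otherwise do at compile time.
    terms = " + ".join(f"a[{i}] * b[{i}]" for i in range(n))
    return (
        f"inline {dtype} dot{n}(const {dtype}* a, const {dtype}* b) {{\n"
        f"  return {terms};\n"
        f"}}\n"
    )


source = make_dot_product(3)
```

The generated text is regular C++ that can be written to a file and compiled, so the specialization logic lives in readable Python instead of template machinery.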
cumm also contains a Python asyncio-based GEMM simulator that shares the same meta program with the CUDA code, which enables GEMM visualization and an easier debugging experience.
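The following sketch shows how asyncio coroutines can stand in for GPU work units in a toy GEMM simulation. It is not cumm's simulator: the names `compute_tile`, `simulate_gemm`, and `run_simulation` are assumptions made for this example, and matrices are plain nested lists. Each coroutine owns one output tile and yields to the event loop after every step along K, so the recorded trace shows how tiles interleave.

```python
import asyncio


async def compute_tile(A, B, C, row0, col0, tile, trace):
    """Accumulate one output tile of C = A @ B, yielding after each k step."""
    rows = range(row0, min(row0 + tile, len(A)))
    cols = range(col0, min(col0 + tile, len(B[0])))
    for k in range(len(B)):
        for i in rows:
            for j in cols:
                C[i][j] += A[i][k] * B[k][j]
        # Record progress, then let the other "tiles" run.
        trace.append((row0, col0, k))
        await asyncio.sleep(0)


async def simulate_gemm(A, B, tile=2):
    """Run one coroutine per output tile and return (C, trace)."""
    M, N = len(A), len(B[0])
    C = [[0] * N for _ in range(M)]
    trace = []
    await asyncio.gather(*(
        compute_tile(A, B, C, r, c, tile, trace)
        for r in range(0, M, tile)
        for c in range(0, N, tile)
    ))
    return C, trace


def run_simulation(A, B, tile=2):
    """Synchronous entry point for the toy simulator."""
    return asyncio.run(simulate_gemm(A, B, tile))


C, trace = run_simulation([[1, 2], [3, 4]], [[5, 6], [7, 8]], tile=1)
```

Because every step is ordinary Python, the trace can be printed or plotted, and a debugger can pause inside any tile.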
Install
Prebuilt
We offer prebuilt binaries for Python 3.7-3.10 and CUDA 10.2/11.1/11.3/11.4 on Linux (manylinux) and Windows 10/11.
We will offer prebuilt binaries for the CUDA versions supported by the latest PyTorch release. For example, PyTorch 1.9 supports CUDA 10.2 and 11.1, so we support them too.
`pip install cumm-cu102` for CUDA 10.2
`pip install cumm-cu111` for CUDA 11.1
`pip install cumm-cu113` for CUDA 11.3
`pip install cumm-cu114` for CUDA 11.4
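The package names above follow a simple pattern: `cumm-cu` plus the CUDA version without the dot. A small sketch of that mapping, limited to the four versions listed here, is shown below; `cumm_package_name` is a hypothetical helper and not part of cumm.

```python
# CUDA versions with prebuilt packages, as listed in this README.
PREBUILT_CUDA_VERSIONS = ("10.2", "11.1", "11.3", "11.4")


def cumm_package_name(cuda_version: str) -> str:
    """Map a CUDA version such as "11.3" to its pip package name."""
    if cuda_version not in PREBUILT_CUDA_VERSIONS:
        raise ValueError(
            f"no prebuilt cumm package listed for CUDA {cuda_version}"
        )
    return "cumm-cu" + cuda_version.replace(".", "")


package = cumm_package_name("11.3")
```

An unlisted version raises `ValueError`, which is the cue to build from source instead.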
Build from source
Linux
- install build-essential and CUDA
- run `export CUMM_DISABLE_JIT="1"`
- run `python setup.py install`, or `pip install -e .`, or `python setup.py bdist_wheel` followed by `pip install dists/xxx.whl`
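The Linux steps above can also be scripted. The sketch below is only an illustration under assumptions: `build_env`, `build_cumm`, and the `BUILD_COMMANDS` table are hypothetical names, and the script assumes build-essential and CUDA are already installed. It sets `CUMM_DISABLE_JIT` for the child process only, so the calling shell is left unchanged.

```python
import os
import subprocess
import sys

# The three build variants named in the steps above.
BUILD_COMMANDS = {
    "install": [sys.executable, "setup.py", "install"],
    "editable": [sys.executable, "-m", "pip", "install", "-e", "."],
    "wheel": [sys.executable, "setup.py", "bdist_wheel"],
}


def build_env(base_env=None):
    """Return a copy of the environment with CUMM_DISABLE_JIT set to "1"."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUMM_DISABLE_JIT"] = "1"
    return env


def build_cumm(source_dir, mode="install"):
    """Run the chosen build command inside the cumm source directory."""
    subprocess.run(
        BUILD_COMMANDS[mode], cwd=source_dir, env=build_env(), check=True
    )
```

Calling `build_cumm("path/to/cumm", mode="wheel")` runs the wheel build; the resulting wheel still has to be installed with pip afterwards, as in the last step above.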
Windows 10/11
- install Visual Studio 2019 or newer and make sure the C++ development package is installed, then install CUDA
- set the PowerShell script execution policy
- start a new PowerShell and run `tools/msvc_setup.ps1`
- run `$Env:CUMM_DISABLE_JIT = "1"`
- run `python setup.py install`, or `pip install -e .`, or `python setup.py bdist_wheel` followed by `pip install dists/xxx.whl`
Note
This work was done while the author was an employee at TuSimple.
Download files
Built Distributions
Hashes for cumm-0.2.2-cp310-cp310-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 0bdf82c13814eae0ce62b86a5426337ca45e654325c312772525043d84388ebd
MD5 | d87481b5fcb31aa2fed70930d11b240b
BLAKE2b-256 | a12c3896a06bdb9bf2ec996a2968e4570dbaf5409ea9264777680a6a3a011f8a
Hashes for cumm-0.2.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | e7a08dee2cd5cba3dbfb7be8ee685f24c2673184cad54762b5571c347830930d
MD5 | 2660ba22debda565a8c18d92dfa4723c
BLAKE2b-256 | ac312443d221d72b63acf2b145d8dcac81666a18561765e77f02fab2fe7bf2d4
Hashes for cumm-0.2.2-cp39-cp39-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | dc5052915b932f60b3a9b4620c0e8d19d94fb56ebf9167c0779ef325dc106a34
MD5 | e0439b36e6344d05072d11b1d424ad39
BLAKE2b-256 | e27cd5f6f330ebce3d3685a01caad285fce7545401ca4626fc7d5dda0d958f86
Hashes for cumm-0.2.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 13b892fe5ff5eea53b6aa2d3d6102dbaa17904621beb69370189d8128f927ff6
MD5 | 9b6265c64bca0a81efccba2d3d4ab6dd
BLAKE2b-256 | 7f3c485de264653a854ee4d921fd3992589e0de5897c1805d7c31cc56bad6d9b
Hashes for cumm-0.2.2-cp38-cp38-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 6425ff9388f30424648a739ac1a492943d9fff008973fa6e7a39fec2ce4bc54c
MD5 | 4212cbaf8bf0df15d2d1b2d393bfd068
BLAKE2b-256 | 7fd293cb652823ebb21c201fcaf925b0925355527dccdb6dd4b5c651c007898a
Hashes for cumm-0.2.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 490c9c70478564ba621bb16591bbe8d42892096ab9569900f3ffc922fa3817a1
MD5 | 03ba4e2ccd6f2af7755dbc9a488b4646
BLAKE2b-256 | 7f58e5097e21b8a9380647afe4f9d73241ccdc46479745dc08fd74553d205a94
Hashes for cumm-0.2.2-cp37-cp37m-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 9fa1153feca23e7d18e43c0ed2c44952f6a933903c2b42f46690f7dc0b9f13e9
MD5 | 02a37f63cfdf2566317eef43179f5ac2
BLAKE2b-256 | 881f6faf4ce7207ee36c5eba3d0012eec3a8ef93f5064fad814f44a968758f19
Hashes for cumm-0.2.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | bf228140807e50be7eb977bdcca2515213852f64440973d69ed5ce90496388a4
MD5 | 230a1a6b729f6f7a84ea21e80328523b
BLAKE2b-256 | 9c5c9ee924ee455b4c5276865f280189ecb611e9615946c08ce9689524087986
Hashes for cumm-0.2.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 3705e88f915326e36436260bb7b716e402c713ab704a5e7c3885c6990bfc22ef
MD5 | 968971e8a8eb5fd5f60f7485b6f09faa
BLAKE2b-256 | c2c7948a1f9665883b6fa0ed564bdb3dc55dc3f4849c20c0c09cd56fe7eb7d5b
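A downloaded wheel can be checked against the SHA256 digests listed above. The sketch below uses only Python's standard `hashlib`; the function names `sha256_of_file` and `verify_wheel` are hypothetical and chosen for this example.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path, expected_sha256):
    """Return True if the file's SHA256 matches the published digest."""
    return sha256_of_file(path) == expected_sha256.strip().lower()


# Usage, assuming the wheel was downloaded to the current directory:
# verify_wheel(
#     "cumm-0.2.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
#     "e7a08dee2cd5cba3dbfb7be8ee685f24c2673184cad54762b5571c347830930d",
# )
```

Reading in chunks keeps memory use flat regardless of wheel size. A mismatch means the file should be downloaded again before it is installed.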