CUda Matrix Multiply library
cumm
cumm was developed while learning CUTLASS, which uses C++ templates so heavily that the code becomes hard to maintain. So I developed pccm, which uses Python as the metaprogramming language, to replace C++ template metaprogramming.
pccm has since become the foundational framework of cumm and of my other C++ projects such as spconv.
cumm also contains a Python asyncio-based GEMM simulator that shares the same metaprogram with the CUDA code, which enables GEMM visualization and an easier debugging experience.
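To make the idea of using Python as the metaprogramming language concrete, here is a minimal sketch that generates CUDA C++ source text from ordinary Python values. It does not use the pccm API; the function name emit_scale_kernel, the toy kernel it emits, and the constants kTileM and kTileN are all hypothetical and chosen only for illustration.

```python
def emit_scale_kernel(name: str, tile_m: int, tile_n: int, scale: float = 2.0) -> str:
    """Return CUDA C++ source text for a toy kernel specialized on a tile shape.

    Hypothetical illustration only: this is not how pccm or cumm emit code.
    Values that would be template parameters in C++ are plain Python arguments.
    """
    if tile_m <= 0 or tile_n <= 0:
        raise ValueError("tile sizes must be positive")
    lines = [
        f'extern "C" __global__ void {name}(const float* src, float* dst) {{',
        f"  constexpr int kTileM = {tile_m};",
        f"  constexpr int kTileN = {tile_n};",
        "  int base = (blockIdx.x * blockDim.x + threadIdx.x) * kTileM * kTileN;",
    ]
    # The loop runs in Python, so the emitted kernel body is already unrolled.
    for i in range(tile_m * tile_n):
        lines.append(f"  dst[base + {i}] = src[base + {i}] * {float(scale)}f;")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the generated source for a 2x2 tile.
    print(emit_scale_kernel("scale_tile_2x2", 2, 2))
```

Because the specialization happens in Python, the generated source can be printed, diffed and inspected like any other string, which is harder to do with an instantiated C++ template.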
Install
Prebuilt
We offer Python 3.7-3.10 and CUDA 10.2/11.1/11.3/11.4 prebuilt binaries for Linux (manylinux) and Windows 10/11.
We will offer prebuilt binaries for the CUDA versions supported by the latest PyTorch release. For example, PyTorch 1.9 supports CUDA 10.2 and 11.1, so we support them too.
pip install cumm-cu102 for CUDA 10.2
pip install cumm-cu111 for CUDA 11.1
pip install cumm-cu113 for CUDA 11.3
pip install cumm-cu114 for CUDA 11.4
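The package names above follow a simple pattern: cumm-cu followed by the CUDA major and minor version without the dot. A small sketch of that mapping, assuming only the four versions listed here; the helper cumm_package_for_cuda is hypothetical and not part of cumm.

```python
# CUDA versions with prebuilt packages, as listed in this README.
PREBUILT_CUDA_VERSIONS = ("10.2", "11.1", "11.3", "11.4")


def cumm_package_for_cuda(cuda_version: str) -> str:
    """Map a CUDA version string such as "11.3" to its pip package name.

    Hypothetical helper for illustration; a patch component ("10.2.89") is ignored.
    """
    major_minor = ".".join(cuda_version.strip().split(".")[:2])
    if major_minor not in PREBUILT_CUDA_VERSIONS:
        raise ValueError(f"no prebuilt cumm package listed for CUDA {cuda_version}")
    return "cumm-cu" + major_minor.replace(".", "")


if __name__ == "__main__":
    print("pip install", cumm_package_for_cuda("11.3"))
```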
Build from source
Linux
- install build-essential and CUDA
- run export CUMM_DISABLE_JIT="1"
- run one of the following:
  - python setup.py install
  - pip install -e .
  - python setup.py bdist_wheel, then pip install dist/xxx.whl
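The build steps can also be driven from a script. The sketch below sets CUMM_DISABLE_JIT in a copy of the environment and runs one of the documented commands; the names BUILD_COMMANDS, build_env and run_build are hypothetical, and the script assumes it is run from a cumm source checkout with the prerequisites already installed.

```python
import os
import subprocess
import sys

# The documented build variants, keyed by a hypothetical mode name.
BUILD_COMMANDS = {
    "install": [sys.executable, "setup.py", "install"],
    "editable": [sys.executable, "-m", "pip", "install", "-e", "."],
    "wheel": [sys.executable, "setup.py", "bdist_wheel"],
}


def build_env(base=None):
    """Return a copy of the environment with CUMM_DISABLE_JIT set to "1"."""
    env = dict(os.environ if base is None else base)
    env["CUMM_DISABLE_JIT"] = "1"
    return env


def run_build(mode, source_dir="."):
    """Run one documented build command inside source_dir; raises on failure."""
    subprocess.run(BUILD_COMMANDS[mode], cwd=source_dir, env=build_env(), check=True)
```

Calling run_build("wheel") corresponds to the bdist_wheel variant; installing the resulting wheel file is left as a separate step because its file name depends on the build.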
Windows 10/11
- install Visual Studio 2019 or newer and make sure the C++ development package is installed, then install CUDA
- set the PowerShell script execution policy
- start a new PowerShell session and run tools/msvc_setup.ps1
- run $Env:CUMM_DISABLE_JIT = "1"
- run one of the following:
  - python setup.py install
  - pip install -e .
  - python setup.py bdist_wheel, then pip install dist/xxx.whl
Note
This work was done while the author was an employee at TuSimple.
Download files
Built Distributions
Hashes for cumm_cu102-0.2.2-cp310-cp310-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 40ff0216069e4c1bb7a3b9a018855c54d81440cc42a895c3957351cdf77dcbfa
MD5 | 43ff28d33ae3f38503d0f2e0cd50238c
BLAKE2b-256 | 2f5b270854ed9f5e0ba7400b983616292e20dea0dfe1cc227b2a74edd89c2219
Hashes for cumm_cu102-0.2.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 27eaa4b59c10c8542d0a213cf30c8f9fe163c012b1813a604322226b8adebff8
MD5 | f7169a0f4bfa5e7858892a6adbc17569
BLAKE2b-256 | 3a09881533cff9040627bb8c2330216fca40aa84a9703f3e34008208ca3d796b
Hashes for cumm_cu102-0.2.2-cp39-cp39-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | d080d262c343ac94ea302e4de3d1b8555a7282706ba1149cc346927549a27455
MD5 | 65c2808860eccf7f2148525521a580c6
BLAKE2b-256 | 2ce91e06878250ccc211d61b2305d53cbc9085450adc923a339e29fbc5563aa6
Hashes for cumm_cu102-0.2.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 6fef8d0961a571246326ed730a1b8e7c5f104eea345d4d1ec48c12ac948d0368
MD5 | d9296b37921541d86fded0b8662d2f66
BLAKE2b-256 | 739d2db5cdd5673a320a4844470702bde74944c3f93bed1707fb409cc1c8a106
Hashes for cumm_cu102-0.2.2-cp38-cp38-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | b2172700c0c1b00691ad88bb9764c887a1430bb0266aab62e065126a88e8f5e6
MD5 | 43e05f3e1ab5046c67545b371053cc3c
BLAKE2b-256 | 211aafd89c63b63b8b7281f15abbcbf6171234961035777c443e85a1122eb297
Hashes for cumm_cu102-0.2.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 900a26c231c3ca6132e33240b87ea4ee51f6fc30cef83f07b7c346e3ebeebca2
MD5 | 8cc4767c7600cf08cc5ad7fb97e6adf1
BLAKE2b-256 | bf06eb529847ed18e75933f6041ca45a53221a73b5e17246a417f99497b8338b
Hashes for cumm_cu102-0.2.2-cp37-cp37m-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 81db2a5177a4b0b15f4ab3502b067cc3f67a188e92d1e91dd43d0e96e73e8bdd
MD5 | 603f347c3477ef127617fb32f18bb69a
BLAKE2b-256 | f5a54f086945e7c58e1f6fd317447b72b35cda36c754f14758a08d25f90b8576
Hashes for cumm_cu102-0.2.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 937fb3d6ee3712643a9d6f27331658a90ece4ca4b27bdd9a5a6cc171d9bdfc0a
MD5 | e3040f56c30bfe98ba724421f77110ed
BLAKE2b-256 | 1a5c2fda69a6af813f799b71949a1bb50b373cd6d8d58c9f5c3692e1dfa5c093
Hashes for cumm_cu102-0.2.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 7e0d1710841def0433c8b0f99aafd79391288aa48ef0f228e2797247edc5eb6f
MD5 | afc0be7428594e7e7300b4db1b82bd99
BLAKE2b-256 | 07db1f9ad30315fcc6523600b4cef4d7a52676c502eb675b908c7081dc92a7b8
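A downloaded wheel can be checked against the SHA256 digest listed for it. The following is a minimal sketch using Python's standard hashlib module; the helper names sha256_of_file and matches_sha256 are hypothetical, and the path passed in is whatever file you downloaded.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_sha256 would be called with the path of the downloaded wheel and the SHA256 value from the matching "Hashes for" table; a False result suggests the download is corrupt or is a different file.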