CUda Matrix Multiply library
cumm
cumm was developed while learning CUTLASS, which uses too many C++ templates and makes code unmaintainable. So I developed pccm, which uses Python as the meta-programming language, to replace C++ template meta-programming.
pccm has now become a foundational framework of cumm and of my other C++ projects such as spconv.
cumm also contains a Python asyncio-based GEMM simulator that shares the same meta program with the CUDA code, enabling GEMM visualization and an easy debugging experience.
BREAKING CHANGES
- 0.3.1: tv::DType enum values changed. This affects all binary code that uses tv::Tensor. You must recompile all code if you upgrade to cumm >= 0.3.1.
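The rule above can be sketched as a small check. The helper names below (parse_version, needs_recompile) are hypothetical and not part of cumm; the sketch assumes plain dotted versions such as 0.3.1 with no suffixes.

```python
# Hypothetical helpers, not part of cumm.
# First release with the changed tv::DType enum values (from the note above).
DTYPE_BREAK = (0, 3, 1)


def parse_version(version):
    """Turn a plain dotted version such as "0.3.1" into (0, 3, 1)."""
    return tuple(int(part) for part in version.split("."))


def needs_recompile(old_version, new_version):
    """True if the upgrade moves from before 0.3.1 to 0.3.1 or later."""
    return parse_version(old_version) < DTYPE_BREAK <= parse_version(new_version)
```

For example, needs_recompile("0.2.8", "0.3.4") is true, while an upgrade from 0.3.1 to 0.3.4 does not cross the boundary.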
News
- Ampere feature support (by EvernightAurora)
Install
Prebuilt
We offer prebuilt binaries for Python 3.7-3.11 and CUDA 10.2/11.3/11.4/11.7/12.0 on Linux (manylinux) and Windows 10/11.
- CPU-only: pip install cumm
- CUDA 10.2: pip install cumm-cu102
- CUDA 11.3: pip install cumm-cu113
- CUDA 11.4: pip install cumm-cu114
- CUDA 11.7: pip install cumm-cu117
- CUDA 12.0: pip install cumm-cu120
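As an illustration of the naming scheme above, the following sketch maps a CUDA version to the matching prebuilt package name. The function prebuilt_package is a hypothetical helper, not part of cumm, and it only knows the versions listed in this section.

```python
# Hypothetical helper, not part of cumm.
# Prebuilt CUDA variants listed above; other versions need a source build.
PREBUILT_CUDA = ("10.2", "11.3", "11.4", "11.7", "12.0")


def prebuilt_package(cuda_version=None):
    """Return the pip package name for a CUDA version, or "cumm" for CPU-only."""
    if cuda_version is None:
        return "cumm"
    if cuda_version not in PREBUILT_CUDA:
        raise ValueError(f"no prebuilt cumm wheel listed for CUDA {cuda_version}")
    # "11.4" -> "cumm-cu114"
    return "cumm-cu" + cuda_version.replace(".", "")
```

Passing a version without a listed prebuilt wheel (for example "11.1") raises ValueError, which signals that a source build is needed.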
Build from source for development (JIT, recommended for development)
WARNING: Use code from tags! Code in the main branch may contain bugs.
The C++ code is rebuilt automatically whenever you change C++ code in the project.
Linux
- Uninstall any cumm installed by pip. Make sure no "cumm" appears in the output of:
  pip list | grep cumm
- Install build-essential and CUDA.
- Clone the repository, check out a tag, and install in editable mode:
  git clone https://github.com/FindDefinition/cumm
  cd ./cumm
  git checkout tags/<tag_name>
  pip install -e .
- In Python, run import cumm and wait for the build to finish.
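The "no cumm in pip list" check from the first step can also be done from Python, which avoids relying on grep. This is a minimal sketch using the standard library's importlib.metadata (Python 3.8+); the function names are hypothetical and not part of cumm. Run it before the editable install, since that install registers a cumm distribution itself.

```python
from importlib import metadata


def find_cumm_distributions(names):
    """Keep distribution names that are cumm or a cumm-cuXXX variant."""
    found = []
    for name in names:
        normalized = name.lower().replace("_", "-")
        if normalized == "cumm" or normalized.startswith("cumm-cu"):
            found.append(name)
    return sorted(found)


def installed_cumm():
    """List cumm distributions visible to the current interpreter."""
    names = [dist.metadata["Name"] for dist in metadata.distributions()]
    # Skip entries with broken metadata (missing Name field).
    return find_cumm_distributions(n for n in names if n)


if __name__ == "__main__":
    leftovers = installed_cumm()
    if leftovers:
        print("uninstall these first:", ", ".join(leftovers))
    else:
        print("no pip-installed cumm found")
```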
Windows
- Uninstall any spconv and cumm installed by pip. Make sure no "cumm" appears in the output of:
  pip list | grep cumm
- Install Visual Studio 2019 or newer. Make sure the C++ development component is installed. Install CUDA.
- Set the PowerShell script execution policy.
- Start a new PowerShell and run tools/msvc_setup.ps1
- Clone the repository, check out a tag, and install in editable mode:
  git clone https://github.com/FindDefinition/cumm
  cd ./cumm
  git checkout tags/<tag_name>
  pip install -e .
- In Python, run import cumm and wait for the build to finish.
Build wheel from source
WARNING: Use code from tags! Code in the main branch may contain bugs.
WARNING: If CUMM_CUDA_VERSION is set to a CUDA version, the following steps create a wheel named "cumm-cuxxx", not "cumm". This means projects that depend on cumm must list cumm-cuxxx as the dependency, not cumm. If CUMM_CUDA_VERSION isn't set, cumm is always built with CUDA, so CUDA must exist on your system. In that case the wheel is named cumm even though it is built with CUDA.
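The naming rule in this warning can be written down as a short sketch. The function wheel_name is hypothetical and does not come from cumm's setup.py; it assumes the "xxx" in cumm-cuxxx is the CUDA version with the dot removed (as in the prebuilt package names), and that an empty value yields the plain name.

```python
import os


def wheel_name(environ=None):
    """Predict the wheel's project name from CUMM_CUDA_VERSION."""
    env = os.environ if environ is None else environ
    cuda_version = env.get("CUMM_CUDA_VERSION")
    if not cuda_version:
        # Unset: the wheel is named plain "cumm" even though it is built
        # with CUDA. An empty value is assumed to give the same name.
        return "cumm"
    # Assumption: "11.4" and "114" both map to "cumm-cu114".
    return "cumm-cu" + cuda_version.replace(".", "")
```

A project that depends on a wheel built with CUMM_CUDA_VERSION=11.4 would therefore list cumm-cu114, not cumm, in its requirements.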
Linux
It is recommended to build Linux packages in the official build Docker image. Building with CUDA support doesn't require a real GPU.
Build in Official Docker
- Select a CUDA version. Available: CUDA 11.1, 11.3, 11.4, 11.5, 12.0.
- Example for CUDA 11.4:
  git clone https://github.com/FindDefinition/cumm
  cd ./cumm
  docker run --rm -e PLAT=manylinux2014_x86_64 -e CUMM_CUDA_VERSION=114 -v `pwd`:/io scrin/manylinux2014-cuda:cu114-devel-1.0.0 bash -c "source /etc/bashrc && /io/tools/build-wheels.sh"
Build in your environment
- Install build-essential and CUDA.
- Set the environment variable for the installed CUDA version, for example export CUMM_CUDA_VERSION="11.4". To build CPU-only, run export CUMM_CUDA_VERSION="". If CUMM_CUDA_VERSION isn't set, you need to ensure the CUDA libraries are inside the OS search path, and the built wheel will be named cumm; otherwise it will be named cumm-cuxxx.
- Run export CUMM_DISABLE_JIT="1"
- Run python setup.py bdist_wheel, then pip install dist/xxx.whl
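The steps above can be scripted. This is a minimal sketch, not an official build script: build_env and build_wheel are hypothetical helper names, and the wheel lookup assumes setuptools' default output directory, dist/.

```python
import os
import subprocess
import sys
from pathlib import Path


def build_env(cuda_version, base_env=None):
    """Copy the environment and add the two variables from the steps above."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUMM_CUDA_VERSION"] = cuda_version  # "" requests a CPU-only build
    env["CUMM_DISABLE_JIT"] = "1"
    return env


def build_wheel(repo_dir, cuda_version):
    """Run the documented build command in repo_dir and return the wheels found."""
    subprocess.run(
        [sys.executable, "setup.py", "bdist_wheel"],
        cwd=repo_dir,
        env=build_env(cuda_version),
        check=True,  # raise if the build fails
    )
    # setuptools writes wheels to dist/ by default.
    return sorted(Path(repo_dir, "dist").glob("*.whl"))
```

Calling build_wheel("./cumm", "11.4") from the parent directory of the checkout would run the build and return the wheel paths, which can then be passed to pip install.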
Windows 10/11
- Install Visual Studio 2019 or newer. Make sure the C++ development package is installed. Install CUDA.
- Set the PowerShell script execution policy.
- Start a new PowerShell and run tools/msvc_setup.ps1
- Set the environment variable for the installed CUDA version, for example $Env:CUMM_CUDA_VERSION = "11.4". To build CPU-only, run $Env:CUMM_CUDA_VERSION = "". If CUMM_CUDA_VERSION isn't set, you need to ensure the CUDA libraries are inside the OS search path, and the built wheel will be named cumm; otherwise it will be named cumm-cuxxx.
- Run $Env:CUMM_DISABLE_JIT = "1"
- Run python setup.py bdist_wheel, then pip install dist/xxx.whl
Contributors
- EvernightAurora: added Ampere feature support.
Note
This work was done while the author was an employee at TuSimple.
LICENSE
Apache 2.0