An easy to use CUDA/OpenCL kernel tuner in Python
Create optimized GPU applications in any mainstream GPU programming language (CUDA, HIP, OpenCL, OpenACC).
What Kernel Tuner does:
- Works as an external tool to benchmark and optimize GPU kernels in isolation
- Can be used directly on existing kernel code without extensive changes
- Can be used with applications in any host programming language
- Blazing fast search space construction
- More than 20 optimization algorithms to speed up tuning
- Energy measurements and optimizations (power capping, clock frequency tuning)
- ... and much more, including caching, output verification, tuning host and device code, and user-defined metrics. See the full documentation.
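To make "search space" concrete, the sketch below enumerates every combination of a few tunable parameters and filters out the ones that violate a restriction. It is a plain-Python illustration of the idea, not Kernel Tuner's own implementation; the helper `build_search_space`, the parameter names other than `block_size_x`, and the 256-thread limit are all assumptions made for this example.

```python
from itertools import product


def build_search_space(tune_params, restrictions=()):
    """Enumerate every parameter combination that passes all restrictions."""
    names = list(tune_params)
    space = []
    for values in product(*(tune_params[name] for name in names)):
        config = dict(zip(names, values))
        # Keep the configuration only if every restriction accepts it.
        if all(check(config) for check in restrictions):
            space.append(config)
    return space


# Hypothetical tunable parameters for a 2D kernel.
tune_params = {
    "block_size_x": [16, 32, 64, 128],
    "block_size_y": [1, 2, 4, 8],
    "tile_size": [1, 2, 4],
}

# Hypothetical restriction: at most 256 threads per thread block.
restrictions = [lambda p: p["block_size_x"] * p["block_size_y"] <= 256]

search_space = build_search_space(tune_params, restrictions)
total = 1
for values in tune_params.values():
    total *= len(values)
print(f"{len(search_space)} of {total} configurations satisfy the restrictions")
```

Brute-force enumeration like this grows multiplicatively with every added parameter, which is why fast search space construction and smarter optimization algorithms matter for larger tuning problems.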
Installation
- First, make sure you have your CUDA, OpenCL, or HIP compiler installed
- Then type one of:
pip install kernel_tuner[cuda]
pip install kernel_tuner[opencl]
pip install kernel_tuner[hip]
- Or why not all of them:
pip install kernel_tuner[cuda,opencl,hip]
More information on installation, including for other languages, is available in the installation guide.
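After installing, a quick way to see which Python packages are importable is to probe for them without actually importing anything. The snippet below is a minimal sketch: the mapping from backend to module name (`pycuda` for CUDA, `pyopencl` for OpenCL) is an assumption about a typical setup, and the helper `available_backends` is hypothetical. The installation guide remains the authoritative source on dependencies.

```python
from importlib.util import find_spec

# Assumed module names per backend; adjust these to match your installation.
BACKEND_MODULES = {
    "cuda": "pycuda",
    "opencl": "pyopencl",
}


def available_backends(modules=BACKEND_MODULES):
    """Report, per backend, whether its Python module can be found."""
    return {name: find_spec(module) is not None for name, module in modules.items()}


print("kernel_tuner installed:", find_spec("kernel_tuner") is not None)
for backend, found in available_backends().items():
    print(f"{backend}: {'found' if found else 'not found'}")
```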
Example
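import numpy as np
from kernel_tuner import tune_kernel

# CUDA kernel source; block_size_x is not declared here because
# Kernel Tuner inserts it as a compile-time constant for each configuration
kernel_string = """
__global__ void vector_add(float *c, float *a, float *b, int n) {
    int i = blockIdx.x * block_size_x + threadIdx.x;
    if (i<n) {
        c[i] = a[i] + b[i];
    }
}
"""

# Problem size and kernel arguments, in the order of the kernel signature
n = np.int32(10000000)
a = np.random.randn(n).astype(np.float32)
b = np.random.randn(n).astype(np.float32)
c = np.zeros_like(a)
args = [c, a, b, n]

# Tunable parameters and the values to try for each
tune_params = {"block_size_x": [32, 64, 128, 256, 512]}

tune_kernel("vector_add", kernel_string, n, args, tune_params)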
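In the example above, the problem size `n` is passed to `tune_kernel` so that the number of thread blocks can be derived for each value of `block_size_x`. The sketch below shows the arithmetic this implies, assuming the grid size is the problem size divided by the thread block size, rounded up. The helper `grid_size` is hypothetical and only illustrates the calculation.

```python
def grid_size(problem_size, block_size):
    """Thread blocks needed to cover problem_size elements (ceiling division)."""
    return (problem_size + block_size - 1) // block_size


n = 10_000_000
for block_size_x in [32, 64, 128, 256, 512]:
    blocks = grid_size(n, block_size_x)
    # Threads launched beyond the end of the arrays.
    spare = blocks * block_size_x - n
    print(f"block_size_x={block_size_x:3d}: {blocks} blocks, {spare} spare threads")
```

When `n` is not a multiple of the block size, the last block contains threads with an index beyond the arrays, which is why the kernel guards the addition with `if (i<n)`.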
More examples here.
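Once tuning has finished, the benchmark records can be post-processed in plain Python. The sketch below assumes each record is a dictionary holding the tunable parameter values plus a `time` entry in milliseconds. The timings are made-up placeholders for illustration, not measurements, and the helper `bandwidth_gb_s` is hypothetical.

```python
# Placeholder records in the assumed shape; the timings are invented
# for illustration and do not come from a real GPU.
results = [
    {"block_size_x": 32, "time": 0.80},
    {"block_size_x": 64, "time": 0.65},
    {"block_size_x": 128, "time": 0.60},
    {"block_size_x": 256, "time": 0.62},
    {"block_size_x": 512, "time": 0.70},
]

n = 10_000_000
# vector_add reads a and b and writes c, each holding n float32 values.
bytes_moved = 3 * n * 4


def bandwidth_gb_s(record):
    """Effective memory bandwidth in GB/s for one benchmark record."""
    seconds = record["time"] / 1e3
    return bytes_moved / 1e9 / seconds


best = min(results, key=lambda record: record["time"])
print(f"best block_size_x: {best['block_size_x']}")
print(f"effective bandwidth: {bandwidth_gb_s(best):.1f} GB/s")
```

A derived quantity such as bandwidth is often easier to compare against the hardware's theoretical peak than a raw kernel time.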
Resources
- Full documentation
- Guides:
- Features & Use cases:
- Kernel Tuner Tutorial slides [PDF], hands-on:
- Energy Efficient GPU Computing tutorial slides [PDF], hands-on:
Kernel Tuner ecosystem
C++ magic to integrate auto-tuned kernels into C++ applications
C++ data types for mixed-precision CUDA kernel programming
Monitor, analyze, and visualize auto-tuning runs
Communication & Contribution
- GitHub Issues: Bug reports, install issues, feature requests, work in progress
- GitHub Discussion group: General questions, Q&A, thoughts
Contributions are welcome! For feature requests, bug reports, or usage problems, please feel free to create an issue. For more extensive contributions, check the contribution guide.
Citation
If you use Kernel Tuner in research or research software, please cite the most relevant among the publications on Kernel Tuner. To refer to the project as a whole, please cite:
@article{kerneltuner,
  author  = {Ben van Werkhoven},
  title   = {Kernel Tuner: A search-optimizing GPU code auto-tuner},
  journal = {Future Generation Computer Systems},
  year    = {2019},
  volume  = {90},
  pages   = {347-358},
  url     = {https://www.sciencedirect.com/science/article/pii/S0167739X18313359},
  doi     = {https://doi.org/10.1016/j.future.2018.08.004}
}