
A simple CUDA/OpenCL kernel tuner in Python

Project description


The goal of this project is to provide an as-simple-as-possible tool for tuning CUDA and OpenCL kernels. This means that any CUDA or OpenCL kernel can be tuned without requiring extensive changes to the original kernel code.

A very common problem in GPU programming is that some combination of thread block dimensions and other kernel parameters, like tiling or unrolling factors, results in dramatically better performance than other kernel configurations. The goal of auto-tuning is to automate the process of finding the best performing configuration for a given device.

The kernel tuner is designed so that you can use the tuned kernel directly, without introducing any new dependencies. The tuned kernels can afterwards be used independently of the programming environment; whether that is C/C++/Java/Fortran or Python doesn't matter.

The kernel_tuner module currently contains one main function, tune_kernel, to which you pass at least the kernel name, a string containing the kernel code, the problem size, a list of kernel function arguments, and a dictionary of tunable parameters. There are also many optional parameters; for a complete list, see the full documentation of tune_kernel.
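
In its simplest form, a call looks like the sketch below; all argument values here are placeholders, and the Example usage section further down shows a complete, runnable version.

from kernel_tuner import tune_kernel

# minimal call shape: kernel name, kernel source as a string, problem size,
# kernel argument list, and the dictionary of tunable parameters
tune_kernel("my_kernel", kernel_source, problem_size, args, tune_params)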

Documentation

The full documentation is available here.

Installation

To install from PyPI:

  • pip install kernel_tuner

To install from source (and get the tutorial and examples):

  • clone the repository

    • With a GitHub account: git clone git@github.com:benvanwerkhoven/kernel_tuner.git

    • Without GitHub account: git clone https://github.com/benvanwerkhoven/kernel_tuner.git

  • change into the top-level directory

    • cd kernel_tuner

  • install using

    • pip install .

Dependencies

Python 2.7 or Python 3.5

PyCUDA and/or PyOpenCL (https://mathema.tician.de/software/)

  • To tune CUDA kernels

    • Make sure you have the CUDA Toolkit installed

    • You can install PyCUDA using pip install pycuda

  • To tune OpenCL kernels

    • Make sure you have an OpenCL compiler for your intended OpenCL platform

    • You can install PyOpenCL using pip install pyopencl

Example usage

The following shows a simple example for tuning a CUDA kernel:

import numpy
from kernel_tuner import tune_kernel

kernel_string = """
__global__ void vector_add(float *c, float *a, float *b, int n) {
    int i = blockIdx.x * block_size_x + threadIdx.x;
    if (i<n) {
        c[i] = a[i] + b[i];
    }
}
"""

size = 10000000

a = numpy.random.randn(size).astype(numpy.float32)
b = numpy.random.randn(size).astype(numpy.float32)
c = numpy.zeros_like(b)
n = numpy.int32(size)
args = [c, a, b, n]

tune_params = dict()
tune_params["block_size_x"] = [128+64*i for i in range(15)]  # 128, 192, ..., 1024

tune_kernel("vector_add", kernel_string, size, args, tune_params)

The exact same Python code can be used to tune an OpenCL kernel:

kernel_string = """
__kernel void vector_add(__global float *c, __global float *a, __global float *b, int n) {
    int i = get_global_id(0);
    if (i<n) {
        c[i] = a[i] + b[i];
    }
}
"""

Or even just a C function; see the example in the examples directory.

You can find these, and many more extensive example codes, in the examples directory.

See the full documentation for several highly detailed tutorial-style explanations of example kernels and the scripts to tune them.

Tuning host and kernel code

It is also possible to tune for combinations of tunable parameters in both host and kernel code. This allows for a number of powerful things, such as tuning the number of streams for a kernel that uses CUDA Streams or OpenCL Command Queues to overlap transfers between host and device with kernel execution, in combination with tuning the parameters inside the kernel code. See the convolution_streams example code and the documentation for a detailed explanation of the kernel tuner Python script; a rough sketch of such a setup is shown below.
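
As an illustration, the sketch below shows what tuning a host-code parameter (here num_streams) together with a kernel parameter could look like. The host_string source, the argument list, and the lang="C" backend selection are assumptions modelled on the convolution_streams example; consult that example for the real, complete setup.

from kernel_tuner import tune_kernel

tune_params = dict()
tune_params["block_size_x"] = [16, 32, 64, 128]  # used inside the kernel code
tune_params["num_streams"] = [1, 2, 4, 8]        # used by the host code

# hypothetical host_string: a C function that creates the streams, splits the
# host-device transfers across them, and launches the kernel once per stream
tune_kernel("convolution_streams", host_string, problem_size, args,
            tune_params, lang="C")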

Correctness verification

Optionally, you can let the kernel tuner verify the output of every kernel it compiles and benchmarks, by passing an answer list. This list matches the list of arguments to the kernel, but contains the expected output of the kernel. Input arguments are replaced with None.

answer = [a+b, None, None]  # the order matches the arguments (in args) to the kernel
tune_kernel("vector_add", kernel_string, size, args, tune_params, answer=answer)

Contribution guide

The kernel tuner follows the Google Python style guide, with Sphinxdoc docstrings for the module's public functions. If you want to contribute to the project, please fork it, create a branch containing your addition, and create a pull request.

The tests use relative imports and can be run directly after making changes to the code. To run all tests, use nosetests in the main directory. To run the examples after code changes, you need to run pip install --upgrade . in the main directory. Documentation is generated by typing make html in the doc directory; the contents of doc/build/html/ should then be copied to the sphinxdoc directory of the gh-pages branch.

Before creating a pull request please ensure the following:

  • You have written unit tests to test your additions and all unit tests pass

  • The examples still work and produce the same (or better) results

  • The code is compatible with both Python 2.7 and Python 3.5

  • An entry about the change or addition is created in CHANGELOG.md

Contributing authors so far:

  • Ben van Werkhoven

  • Berend Weel

Citation

A scientific paper about the Kernel Tuner is in preparation; in the meantime, please cite the Kernel Tuner as follows:

@misc{kernel_tuner,
  author = {Ben van Werkhoven},
  title = {Kernel Tuner: A simple CUDA/OpenCL Kernel Tuner in Python},
  year = {2017}
}

