An easy way to run OpenCL kernel files
OpenCL Kernel Python Wrapper
Install
Requirements
- OpenCL GPU hardware
- numpy
- cmake (only when compiling from source)
Install from wheel
pip install pyoclk
Alternatively, download a wheel from the releases and install it with pip.
Compile from source
Clone this repo
Over HTTPS:
git clone --recursive https://github.com/jinmingyi1998/opencl_kernels.git
or over SSH:
git clone --recursive git@github.com:jinmingyi1998/opencl_kernels.git
Install
cd opencl_kernels
python setup.py install
DO NOT move this directory after installation.
Usage
Kernel File
Create a file named add.cl:
kernel void add(global float* a, global float* out, int int_arg, float float_arg) {
    int x = get_global_id(0);
    if (x == 0) {
        printf(" accept int arg: %d, accept float arg: %f\n", int_arg, float_arg);
    }
    out[x] = a[x] * float_arg + int_arg;
}
Python Code
OOP Style
import numpy as np
import oclk

# Input: 10,000 random values as a contiguous (10, 1000) float32 array
a = np.random.rand(100, 100).reshape([10, -1])
a = np.ascontiguousarray(a, np.float32)
# Output buffer with the same shape and dtype as the input
out = np.zeros(a.shape)
out = np.ascontiguousarray(out, np.float32)

# Load the kernel "add" from add.cl
runner = oclk.Runner()
runner.load_kernel("add.cl", "add", "")

# Timing settings for the run
timer = oclk.TimerArgs(
    enable=True,
    warmup=10,
    repeat=50,
    name='add_kernel'
)

# Kernel arguments are listed in the same order as in the kernel signature
runner.run(
    kernel_name="add",
    input=[
        {"name": "a", "value": a},
        {"name": "out", "value": out},
        {"name": "int_arg", "value": 1, "type": "int"},
        {"name": "float_arg", "value": 12.34}
    ],
    output=['out'],
    local_work_size=[1, 1],
    global_work_size=a.shape,
    timer=timer
)

# check result
a = a.reshape([-1])
out = out.reshape([-1])
print(a[:8])
print(out[:8])
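The printed values can also be checked against a CPU reference instead of by eye. The following is a minimal sketch in plain NumPy; `add_reference` and `check_prefix` are hypothetical helper names that are not part of oclk, and the `out` array here is only a stand-in for the buffer that `runner.run` would fill. The prefix length `n` is left as a parameter because which elements the kernel writes may depend on the launch configuration.

```python
import numpy as np


def add_reference(a, int_arg, float_arg):
    """CPU reference for the add kernel: out[x] = a[x] * float_arg + int_arg."""
    a = np.asarray(a, dtype=np.float32)
    return a * np.float32(float_arg) + np.float32(int_arg)


def check_prefix(a, out, int_arg, float_arg, n):
    """Compare the first n flattened elements of out with the reference."""
    expected = add_reference(a, int_arg, float_arg).reshape(-1)[:n]
    actual = np.asarray(out, dtype=np.float32).reshape(-1)[:n]
    return bool(np.allclose(actual, expected, rtol=1e-5))


rng = np.random.default_rng(0)
a = rng.random((10, 1000), dtype=np.float32)
# Stand-in for the device result; in practice this is the `out` array
# passed to runner.run in the example above.
out = add_reference(a, 1, 12.34)
print(check_prefix(a, out, 1, 12.34, n=8))
```

A relative tolerance is used rather than exact equality because a device may round the multiply-add differently from the CPU.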
Kernel Benchmark
- Write a config file such as bench_add.yaml
- Run the benchmark:
python -m oclk benchmark -f examples/bench_add.yaml
Example
python -m oclk benchmark -f examples/bench_add.yaml
output:
[Timer bench_add.add] [CNT: 1] [AVG: 0.539ms] [STDEV 0.000ms] [TOTAL 0.539ms]
[Timer bench_add.add_constant] [CNT: 1] [AVG: 0.576ms] [STDEV 0.000ms] [TOTAL 0.576ms]
[Timer bench_add.add_batch] [CNT: 1] [AVG: 0.150ms] [STDEV 0.000ms] [TOTAL 0.150ms]
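If only this text output is available (for example in a CI log), the timer lines can be turned into records with a regular expression. This is a sketch that assumes the line layout is exactly the one shown in the sample above; `parse_timer_lines` is a hypothetical helper, not an oclk function.

```python
import re

# Pattern derived from the sample output above; the layout is an assumption.
TIMER_LINE = re.compile(
    r"\[Timer (?P<name>[^\]]+)\] \[CNT: (?P<cnt>\d+)\] "
    r"\[AVG: (?P<avg>[\d.]+)ms\] \[STDEV (?P<stdev>[\d.]+)ms\] "
    r"\[TOTAL (?P<total>[\d.]+)ms\]"
)


def parse_timer_lines(text):
    """Return one dict per timer line found in text."""
    rows = []
    for m in TIMER_LINE.finditer(text):
        rows.append({
            "name": m["name"],
            "count": int(m["cnt"]),
            "avg_ms": float(m["avg"]),
            "stdev_ms": float(m["stdev"]),
            "total_ms": float(m["total"]),
        })
    return rows


sample = """\
[Timer bench_add.add] [CNT: 1] [AVG: 0.539ms] [STDEV 0.000ms] [TOTAL 0.539ms]
[Timer bench_add.add_constant] [CNT: 1] [AVG: 0.576ms] [STDEV 0.000ms] [TOTAL 0.576ms]
[Timer bench_add.add_batch] [CNT: 1] [AVG: 0.150ms] [STDEV 0.000ms] [TOTAL 0.150ms]
"""
rows = parse_timer_lines(sample)
for row in rows:
    print(row["name"], row["avg_ms"])
```

For anything beyond quick inspection, the JSON output mode is likely the more robust source, since it does not depend on the log format.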
python -m oclk benchmark -f examples/bench_add.yaml -s table
output:
benchmark results
┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ timer name             ┃ avg time(ms)   ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ bench_add.add          │ 0.538525390625 │
│ bench_add.add_constant │ 0.581396484375 │
│ bench_add.add_batch    │ 0.149169921875 │
└────────────────────────┴────────────────┘
python -m oclk benchmark -f examples/bench_add.yaml -s json -o bench_add.json
This writes the results to the JSON file bench_add.json:
[
    {
        "name": "bench_add.add",
        "time(ms)": 0.54248046875
    },
    {
        "name": "bench_add.add_constant",
        "time(ms)": 0.5767089843750001
    },
    {
        "name": "bench_add.add_batch",
        "time(ms)": 0.15048828125000002
    }
]
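The JSON file is convenient for post-processing. As a small sketch, the snippet below ranks the kernels by time and reports each one relative to the fastest. It assumes only the two keys visible in the sample above (`name` and `time(ms)`); `rank_benchmarks` is a hypothetical helper, and the sample is embedded as a string so the snippet runs without a GPU.

```python
import json

# Sample copied from the bench_add.json output above.
sample = """
[
    {"name": "bench_add.add", "time(ms)": 0.54248046875},
    {"name": "bench_add.add_constant", "time(ms)": 0.5767089843750001},
    {"name": "bench_add.add_batch", "time(ms)": 0.15048828125000002}
]
"""


def rank_benchmarks(results):
    """Sort benchmark records from fastest to slowest."""
    return sorted(results, key=lambda r: r["time(ms)"])


results = json.loads(sample)
ranked = rank_benchmarks(results)
fastest = ranked[0]["time(ms)"]
for r in ranked:
    ratio = r["time(ms)"] / fastest
    print(f"{r['name']:<24} {r['time(ms)']:.3f} ms  ({ratio:.2f}x)")
```

To read a real result file, replace `json.loads(sample)` with `json.load(open("bench_add.json"))`.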
Kernel Tune
- Start from an OpenCL kernel file, for example add.cl
- Run python -m oclk new tune add, which generates a new file tune_add.py
- Edit tune_add.py
- Run python -m oclk tune -f tune_add.py -o add_tune_result.json
- The results are stored in add_tune_result.json
Example
python -m oclk tune -f examples/tune/tune_add.py -k 3
The results are then written to output.json:
[
    {
        "name": [
            "examples.tune.tune_add",
            "AddTuner"
        ],
        "k": 3,
        "topk_results": [
            {
                "kwargs": {
                    "local_work_size": [
                        512
                    ],
                    "vector_size": 4,
                    "tile_size": 4,
                    "method": "naive"
                },
                "time_ms": 0.67691162109375
            },
            {
                "kwargs": {
                    "local_work_size": [
                        128
                    ],
                    "vector_size": 4,
                    "tile_size": 4,
                    "method": "naive"
                },
                "time_ms": 0.6769140625
            },
            {
                "kwargs": {
                    "local_work_size": [
                        64
                    ],
                    "vector_size": 4,
                    "tile_size": 4,
                    "method": "naive"
                },
                "time_ms": 0.677001953125
            }
        ]
    }
]
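To reuse a tuning result, the fastest configuration can be pulled out of this file. The sketch below assumes the structure shown in the sample above (`name`, `k`, `topk_results`, each with `kwargs` and `time_ms`); `best_configs` is a hypothetical helper, not part of oclk, and the sample is embedded so the snippet is self-contained.

```python
import json

# Abbreviated copy of the tuning output shown above.
sample = """
[
  {
    "name": ["examples.tune.tune_add", "AddTuner"],
    "k": 3,
    "topk_results": [
      {"kwargs": {"local_work_size": [512], "vector_size": 4,
                  "tile_size": 4, "method": "naive"},
       "time_ms": 0.67691162109375},
      {"kwargs": {"local_work_size": [128], "vector_size": 4,
                  "tile_size": 4, "method": "naive"},
       "time_ms": 0.6769140625},
      {"kwargs": {"local_work_size": [64], "vector_size": 4,
                  "tile_size": 4, "method": "naive"},
       "time_ms": 0.677001953125}
    ]
  }
]
"""


def best_configs(tune_results):
    """Map each tuner's dotted name to its fastest kwargs and time."""
    best = {}
    for entry in tune_results:
        key = ".".join(entry["name"])
        top = min(entry["topk_results"], key=lambda r: r["time_ms"])
        best[key] = {"kwargs": top["kwargs"], "time_ms": top["time_ms"]}
    return best


best = best_configs(json.loads(sample))
for name, result in best.items():
    print(name, result["kwargs"], f"{result['time_ms']:.4f} ms")
```

In this sample the three candidates differ by less than 0.0001 ms, so the choice of local_work_size makes little practical difference for this kernel.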