An easy way to run OpenCL kernel files

OpenCL Kernel Python Wrapper

Install

Requirements

  • OpenCL-capable GPU hardware
  • numpy
  • cmake (only when compiling from source)

Install from wheel

pip install pyoclk

Alternatively, download a wheel from the project's releases and install it with pip.

Compile from source

Clone this repo

Clone over HTTPS:

git clone --recursive https://github.com/jinmingyi1998/opencl_kernels.git

or over SSH:

git clone --recursive git@github.com:jinmingyi1998/opencl_kernels.git

Install

cd opencl_kernels
python setup.py install

Do NOT move this directory after installing.

Usage

Kernel File

Create a file named add.cl:

kernel void add(global float *a, global float *out, int int_arg, float float_arg) {
    int x = get_global_id(0);
    if (x == 0) {
        printf(" accept int arg: %d, accept float arg: %f\n", int_arg, float_arg);
    }
    out[x] = a[x] * float_arg + int_arg;
}
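
To check results on the host, the kernel's arithmetic can be mirrored in NumPy. The sketch below is not part of oclk: add_reference is a hypothetical helper name, and it assumes every element of a is processed, which depends on the work sizes chosen at launch. A device may fuse the multiply and the add, so a tolerance-based comparison such as np.allclose is safer than exact equality.

```python
import numpy as np


def add_reference(a, int_arg, float_arg):
    """Hypothetical NumPy mirror of the add kernel: out[x] = a[x] * float_arg + int_arg."""
    a = np.ascontiguousarray(a, dtype=np.float32)
    # Keep the arithmetic in float32, matching the kernel's float arguments.
    return a * np.float32(float_arg) + np.float32(int_arg)


a = np.array([0.0, 0.5, 1.0], dtype=np.float32)
expected = add_reference(a, int_arg=1, float_arg=12.34)
print(expected)
```

After running the kernel, np.allclose(out_from_device, expected) gives a quick sanity check for the elements the kernel actually wrote.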

Python Code

OOP Style

import numpy as np
import oclk

# Host buffers are float32 to match the kernel's `global float*` arguments.
a = np.random.rand(100, 100).reshape([10, -1])
a = np.ascontiguousarray(a, np.float32)
out = np.zeros(a.shape, dtype=np.float32)

runner = oclk.Runner()
runner.load_kernel("add.cl", "add", "")

timer = oclk.TimerArgs(
    enable=True,
    warmup=10,
    repeat=50,
    name='add_kernel'
)
runner.run(
    kernel_name="add",
    input=[
        {"name": "a", "value": a, },
        {"name": "out", "value": out, },
        {"name": "int_arg", "value": 1, "type": "int"},
        {"name": "float_arg", "value": 12.34}
    ],
    output=['out'],
    local_work_size=[1, 1],
    global_work_size=a.shape,
    timer=timer
)
# check result
a = a.reshape([-1])
out = out.reshape([-1])
print(a[:8])
print(out[:8])

Kernel Benchmark

  1. Write a config file such as bench_add.yaml.
  2. Run python -m oclk benchmark -f examples/bench_add.yaml.

Example

python -m oclk benchmark -f examples/bench_add.yaml

Output:

[Timer bench_add.add] [CNT: 1] [AVG: 0.539ms] [STDEV 0.000ms] [TOTAL 0.539ms]
[Timer bench_add.add_constant] [CNT: 1] [AVG: 0.576ms] [STDEV 0.000ms] [TOTAL 0.576ms]
[Timer bench_add.add_batch] [CNT: 1] [AVG: 0.150ms] [STDEV 0.000ms] [TOTAL 0.150ms]

To print the results as a table, pass -s table:

python -m oclk benchmark -f examples/bench_add.yaml -s table

Output:

             benchmark results             
┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ timer name             ┃   avg time(ms) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ bench_add.add          │ 0.538525390625 │
│ bench_add.add_constant │ 0.581396484375 │
│ bench_add.add_batch    │ 0.149169921875 │
└────────────────────────┴────────────────┘

To write the results to a JSON file, pass -s json together with -o:

python -m oclk benchmark -f examples/bench_add.yaml -s json -o bench_add.json

The results are written to bench_add.json:

[
  {
    "name": "bench_add.add",
    "time(ms)": 0.54248046875
  },
  {
    "name": "bench_add.add_constant",
    "time(ms)": 0.5767089843750001
  },
  {
    "name": "bench_add.add_batch",
    "time(ms)": 0.15048828125000002
  }
]
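
The JSON output is convenient for post-processing. As a minimal sketch, assuming the file keeps the list-of-objects layout shown above with "name" and "time(ms)" keys, the fastest kernel can be picked with the standard library alone. fastest_entry is a hypothetical helper, not part of oclk.

```python
import json


def fastest_entry(results):
    """Return the benchmark entry with the smallest "time(ms)" value."""
    return min(results, key=lambda entry: entry["time(ms)"])


# Sample data copied from the bench_add.json listing above. For a real run,
# load the file instead: results = json.load(open("bench_add.json"))
sample = json.loads("""
[
  {"name": "bench_add.add", "time(ms)": 0.54248046875},
  {"name": "bench_add.add_constant", "time(ms)": 0.5767089843750001},
  {"name": "bench_add.add_batch", "time(ms)": 0.15048828125000002}
]
""")

best = fastest_entry(sample)
print(f'{best["name"]}: {best["time(ms)"]:.3f} ms')  # bench_add.add_batch: 0.150 ms
```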

Kernel Tune

  1. Start with an OpenCL kernel file, for example add.cl.
  2. Run python -m oclk new tune add to generate a new file, tune_add.py.
  3. Edit tune_add.py.
  4. Run python -m oclk tune -f tune_add.py -o add_tune_result.json.
  5. The results are stored in add_tune_result.json.

Example

python -m oclk tune -f examples/tune/tune_add.py -k 3

This writes the top 3 results to output.json:

[
  {
    "name": [
      "examples.tune.tune_add",
      "AddTuner"
    ],
    "k": 3,
    "topk_results": [
      {
        "kwargs": {
          "local_work_size": [
            512
          ],
          "vector_size": 4,
          "tile_size": 4,
          "method": "naive"
        },
        "time_ms": 0.67691162109375
      },
      {
        "kwargs": {
          "local_work_size": [
            128
          ],
          "vector_size": 4,
          "tile_size": 4,
          "method": "naive"
        },
        "time_ms": 0.6769140625
      },
      {
        "kwargs": {
          "local_work_size": [
            64
          ],
          "vector_size": 4,
          "tile_size": 4,
          "method": "naive"
        },
        "time_ms": 0.677001953125
      }
    ]
  }
]
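
To reuse the tuned parameters, the result file can be read back in Python. This is a sketch that assumes the layout shown above (a list of tuners, each with "name" and "topk_results" entries carrying "kwargs" and "time_ms"). best_kwargs is a hypothetical helper, not part of oclk, and the sample is trimmed to two entries.

```python
import json


def best_kwargs(tune_results):
    """Map each tuner's dotted name to the kwargs of its fastest entry."""
    best = {}
    for tuner in tune_results:
        # Do not rely on the entries being pre-sorted; pick the minimum time.
        fastest = min(tuner["topk_results"], key=lambda r: r["time_ms"])
        best[".".join(tuner["name"])] = fastest["kwargs"]
    return best


# Trimmed copy of the output.json listing above. For a real run, load the
# file instead: results = json.load(open("output.json"))
sample = json.loads("""
[
  {
    "name": ["examples.tune.tune_add", "AddTuner"],
    "k": 3,
    "topk_results": [
      {"kwargs": {"local_work_size": [512], "vector_size": 4,
                  "tile_size": 4, "method": "naive"},
       "time_ms": 0.67691162109375},
      {"kwargs": {"local_work_size": [128], "vector_size": 4,
                  "tile_size": 4, "method": "naive"},
       "time_ms": 0.6769140625}
    ]
  }
]
""")

for name, kwargs in best_kwargs(sample).items():
    print(name, kwargs)
```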
