cuda4py
=========

Python cffi CUDA bindings and helper classes.

Tested with Python 2.7, Python 3.4 and PyPy on Linux and Windows with CUDA 6.5.

To compile kernel code written in C++, nvcc must be in PATH and
exported functions must be marked as extern "C"
(on Windows, cl.exe must also be in PATH).
Functions written in plain PTX can be used without nvcc.

To use CUBLAS, libcublas.so (cublas64_65.dll on Windows) must be present.
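
The prerequisites above can be checked before importing the module. The following is a minimal sketch using only the Python standard library (Python 3.3 or later, since it relies on `shutil.which`); the helper name `check_cuda_toolchain` is hypothetical and is not part of cuda4py. A library that `find_library` cannot locate may still be loadable through other search paths, so treat a `None` result as a hint rather than proof.

```python
import ctypes.util
import shutil
import sys


def check_cuda_toolchain():
    """Return a dict mapping each prerequisite to its location, or None if not found."""
    on_windows = sys.platform.startswith("win")
    report = {
        # nvcc is only needed to compile C++ kernel source; plain PTX does not need it.
        "nvcc": shutil.which("nvcc"),
        # Library name differs per platform: libcublas.so vs cublas64_65.dll.
        "cublas": ctypes.util.find_library("cublas64_65" if on_windows else "cublas"),
    }
    if on_windows:
        # On Windows, nvcc additionally needs the MSVC compiler cl.exe in PATH.
        report["cl.exe"] = shutil.which("cl")
    return report


if __name__ == "__main__":
    for name, location in sorted(check_cuda_toolchain().items()):
        print("%-8s %s" % (name, location or "NOT FOUND"))
```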

Not all of the CUDA API is currently covered.

To install the module, run:
```bash
python setup.py install
```
or just copy src/cuda4py to any location where the Python
interpreter can find it.
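
As an illustration of the second option, the sketch below adds a directory to `sys.path` at run time when a package cannot be found otherwise. The helper `make_importable` is hypothetical, and `"src"` assumes the script is started from a checkout of the repository.

```python
import importlib
import importlib.util
import os
import sys


def make_importable(package, directory):
    """Prepend `directory` to sys.path if `package` is not yet findable.

    Returns True if the package can be located afterwards, False otherwise.
    """
    if importlib.util.find_spec(package) is None:
        sys.path.insert(0, os.path.abspath(directory))
        # Make sure the import system notices the newly added directory.
        importlib.invalidate_caches()
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    if make_importable("cuda4py", "src"):
        print("cuda4py can be imported")
    else:
        print("cuda4py not found; install it or copy src/cuda4py")
```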

To run the tests, execute:

for Python 2.7:
```bash
PYTHONPATH=src nosetests -w tests
```

for Python 3.4:
```bash
PYTHONPATH=src nosetests3 -w tests
```

for PyPy:
```bash
PYTHONPATH=src pypy tests/test_api.py
PYTHONPATH=src pypy tests/test_cublas.py
```

Currently, PyPy numpy support may be incomplete,
so tests which use numpy arrays may fail.

Example usage:

```python
import cuda4py as cu
import logging
import numpy


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    ctx = cu.Devices().create_some_context()

    # Compile the kernel; extern "C" keeps the exported name unmangled.
    module = cu.Module(
        ctx, source="""
        extern "C"
        __global__ void test(const float *a, const float *b,
                             float *c, const float k) {
          size_t i = blockDim.x * blockIdx.x + threadIdx.x;
          c[i] = (a[i] + b[i]) * k;
        }
        """)
    test = cu.Function(module, "test")

    # Host-side inputs and output.
    a = numpy.arange(1000000, dtype=numpy.float32)
    b = numpy.arange(1000000, dtype=numpy.float32)
    c = numpy.empty(1000000, dtype=numpy.float32)
    k = numpy.array([0.5], dtype=numpy.float32)

    # Device buffers; copy the inputs to the device.
    a_buf = cu.MemAlloc(ctx, a.nbytes)
    b_buf = cu.MemAlloc(ctx, b.nbytes)
    c_buf = cu.MemAlloc(ctx, c.nbytes)
    a_buf.to_device_async(a)
    b_buf.to_device_async(b)

    # Launch the kernel with a grid of a.size entries along x.
    test.set_args(a_buf, b_buf, c_buf, k)
    test((a.size, 1, 1))

    # Copy the result back and compare with the NumPy reference.
    c_buf.to_host(c)
    max_diff = numpy.fabs(c - (a + b) * k[0]).max()
    logging.info("max_diff = %.6f", max_diff)
```
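
The example passes `(a.size, 1, 1)` as the grid, so the kernel index never exceeds the array length. If several threads per block are used instead, the grid is usually sized by ceiling division, and the last block may then overshoot the array. The sketch below only computes the two tuples; `launch_dims` is a hypothetical helper, and this README does not state how a block size is passed to the call, so check that against the library source. With such a launch the kernel would also need the element count as an extra argument and an `if (i < n)` guard.

```python
def launch_dims(n, threads_per_block=256):
    """Return (grid, block) 3-tuples for a 1-D launch covering n elements."""
    if n <= 0 or threads_per_block <= 0:
        raise ValueError("n and threads_per_block must be positive")
    # Ceiling division: enough blocks so that blocks * threads_per_block >= n.
    blocks = (n + threads_per_block - 1) // threads_per_block
    return (blocks, 1, 1), (threads_per_block, 1, 1)


if __name__ == "__main__":
    grid, block = launch_dims(1000000)
    surplus = grid[0] * block[0] - 1000000
    print(grid, block, "threads past the end of the array:", surplus)
```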

Released under the Simplified BSD License.
Copyright (c) 2014, Samsung Electronics Co.,Ltd.
