CUDA-accelerated correlation and sum reduction functions
CUDA Kernels
A Python package providing CUDA-accelerated functions for autocorrelation and sum reduction operations, with automatic CPU fallback when CUDA is not available.
Installation
From PyPI (Recommended)
pip install cuda-kernels
From GitHub
pip install git+https://github.com/AstuteFern/cuda-toolkit.git
From Source
git clone https://github.com/AstuteFern/cuda-toolkit.git
cd cuda-toolkit
pip install .
Requirements
- Python 3.6+
- NumPy
Optional (for CUDA acceleration)
- NVIDIA GPU with CUDA support
- CUDA Toolkit (version 11.0+)
Note: A GPU is not required. If CUDA is not available, the package automatically falls back to its NumPy-based CPU implementations.
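The fallback described in the note can be pictured with a small sketch. This README does not show how the package detects CUDA internally, so the code below is only one common pattern: try to import an optional backend and fall back to a CPU path when the import fails. The module name `_hypothetical_cuda_backend` and the helper names are assumptions made for illustration.

```python
import importlib


def load_backend(candidates=("_hypothetical_cuda_backend",)):
    """Return (module, True) for the first importable backend, else (None, False)."""
    for name in candidates:
        try:
            return importlib.import_module(name), True
        except ImportError:
            continue  # backend not installed, try the next candidate
    return None, False


def total(values, force_cpu=False):
    """Sum values on the optional backend if present, otherwise on the CPU."""
    backend, cuda_available = load_backend()
    if cuda_available and not force_cpu:
        # Hypothetical GPU entry point; only reached if the backend imports.
        return backend.reduction_sum(values)
    return float(sum(values))  # plain-Python CPU fallback


print(total([1.0, 2.0, 3.0]))
```

Because the hypothetical backend is not installed, the call above takes the CPU path, which mirrors what happens on a machine without CUDA.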
Quick Start
import numpy as np
from cuda_kernels import autocorrelation, reduction_sum
# Create test data
data = np.random.randn(1000).astype(np.float32)
# Compute autocorrelation (automatically uses CUDA if available)
acf = autocorrelation(data, max_lag=50)
print(f"Autocorrelation shape: {acf.shape}")
# Compute sum reduction
total = reduction_sum(data)
print(f"Sum: {total}")
API Reference
autocorrelation(data, max_lag=None, force_cpu=False)
Compute autocorrelation of a time series.
Parameters:
- data (numpy.ndarray): Input 1D array (converted to float32)
- max_lag (int, optional): Maximum lag to compute. Default: len(data)-1
- force_cpu (bool): Force CPU implementation. Default: False
Returns:
numpy.ndarray: Autocorrelation values for lags [0, max_lag)
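To make the return shape concrete, here is a minimal NumPy reference sketch of a lagged-product autocorrelation over lags [0, max_lag). The README does not state whether the package subtracts the mean or normalizes by the lag-0 value, so this version assumes neither and returns raw sums of products. The function name `autocorrelation_reference` is hypothetical and is not part of the package.

```python
import numpy as np


def autocorrelation_reference(data, max_lag=None):
    """Raw autocorrelation sums for lags 0 .. max_lag-1 (no mean removal, no normalization)."""
    x = np.asarray(data, dtype=np.float32).ravel()
    n = x.size
    if max_lag is None:
        max_lag = n - 1  # same default as documented above
    max_lag = max(0, min(max_lag, n))  # a lag of n or more has no overlapping samples
    out = np.empty(max_lag, dtype=np.float32)
    for lag in range(max_lag):
        # Dot product of the series with itself shifted by `lag` samples.
        out[lag] = np.dot(x[: n - lag], x[lag:])
    return out


# lag 0: 1+4+9+16 = 30, lag 1: 2+6+12 = 20, lag 2: 3+8 = 11
acf = autocorrelation_reference([1, 2, 3, 4], max_lag=3)
print(acf.shape, acf)
```

If the package normalizes its output, dividing by `out[0]` would be the usual extra step; check the package's results against this sketch before relying on either convention.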
reduction_sum(data, force_cpu=False)
Compute sum of array elements.
Parameters:
- data (numpy.ndarray): Input 1D array (converted to float32)
- force_cpu (bool): Force CPU implementation. Default: False
Returns:
float: Sum of all elements
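For readers unfamiliar with the term, a sum reduction on a GPU is commonly done as a tree: each pass adds the upper half of the buffer onto the lower half until one value remains. The sketch below imitates that pattern in NumPy. It is an illustration of the general technique, not the package's kernel, which this README does not show; the name `tree_sum` is hypothetical.

```python
import numpy as np


def tree_sum(data):
    """Pairwise (tree) sum in float32, halving the active range on every pass."""
    buf = np.asarray(data, dtype=np.float32).ravel().copy()
    n = buf.size
    if n == 0:
        return 0.0
    while n > 1:
        half = (n + 1) // 2
        # Fold the upper part onto the lower part; for odd n the middle element is kept as is.
        buf[: n - half] += buf[half:n]
        n = half
    return float(buf[0])


print(tree_sum([1, 2, 3, 4, 5]))
```

Since the input is converted to float32 and the additions happen in a different order than a left-to-right loop, results for large arrays can differ in the last digits from a float64 sum.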
Examples
Basic Usage
import numpy as np
from cuda_kernels import autocorrelation, reduction_sum
# Example 1: Autocorrelation
signal = np.sin(np.linspace(0, 4*np.pi, 1000)).astype(np.float32)
acf = autocorrelation(signal, max_lag=100)
# Example 2: Sum reduction
data = np.array([1, 2, 3, 4, 5], dtype=np.float32)
total = reduction_sum(data) # Returns 15.0
Checking CUDA Status
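import sys
import cuda_kernels  # sys.modules only contains the submodules once the package has been imported
# Look the submodules up by name, because cuda_kernels.autocorrelation is also the exported function
autocorr_module = sys.modules['cuda_kernels.autocorrelation']
reduction_module = sys.modules['cuda_kernels.reduction']
print(f"CUDA available: {autocorr_module._cuda_available}")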
Force CPU Mode
# Useful for testing or when you want consistent behavior
cpu_result = reduction_sum(data, force_cpu=True)
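One way to use force_cpu in a test is to check that the default path and the CPU path agree within a floating-point tolerance. The helper below is a sketch under that assumption. `backends_agree` and `fake_reduction_sum` are hypothetical names, and the stand-in function only mimics the `force_cpu` keyword so the example runs without the package installed.

```python
import math


def backends_agree(func, data, rel_tol=1e-5):
    """Call func on its default path and with force_cpu=True, then compare the results."""
    default_result = func(data)
    cpu_result = func(data, force_cpu=True)
    return math.isclose(default_result, cpu_result, rel_tol=rel_tol)


def fake_reduction_sum(data, force_cpu=False):
    """Stand-in with the same keyword as reduction_sum; always sums on the CPU."""
    return float(sum(data))


print(backends_agree(fake_reduction_sum, [1.0, 2.0, 3.0]))
```

With the package installed, passing `reduction_sum` instead of the stand-in would exercise the real CUDA and CPU paths. A relative tolerance is used rather than exact equality because float32 sums can depend on the order of additions.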
Performance
- With CUDA: Significant speedup for large arrays (10K+ elements)
- CPU Fallback: Optimized NumPy implementations, still efficient for most use cases
- Automatic Detection: No configuration needed, works out of the box
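The README gives no benchmark figures, so the speedup on a given machine is best measured directly. The helper below is a small timing sketch using only the standard library. `best_time` is a hypothetical name, and the built-in `sum` is used as a stand-in workload so the example runs without the package.

```python
import time


def best_time(func, *args, repeats=5, **kwargs):
    """Return the fastest wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in workload; replace `sum` with the function you want to measure.
values = [float(i) for i in range(10_000)]
elapsed = best_time(sum, values)
print(f"best of 5: {elapsed * 1e3:.3f} ms")
```

With the package installed, comparing `best_time(reduction_sum, data)` against `best_time(reduction_sum, data, force_cpu=True)` for several array sizes would show where the CUDA path starts to pay off. The first GPU call may include one-time initialization cost, so taking the best of several runs gives a fairer comparison.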
License
MIT License - see LICENSE file for details.