
CUDA-accelerated autocorrelation and sum reduction functions

Project description

CUDA Kernels

A Python package providing CUDA-accelerated functions for autocorrelation and sum reduction operations, with automatic CPU fallback when CUDA is not available.

Installation

From PyPI (Recommended)

pip install cuda-kernels

From GitHub

pip install git+https://github.com/AstuteFern/cuda-toolkit.git

From Source

git clone https://github.com/AstuteFern/cuda-toolkit.git
cd cuda-toolkit
pip install .

Requirements

  • Python 3.6+
  • NumPy

Optional (for CUDA acceleration)

  • NVIDIA GPU with CUDA support
  • CUDA Toolkit (version 11.0+)

Note: The package works on any system. If CUDA is not available, it automatically uses optimized CPU implementations.
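The fallback behavior can be pictured with a minimal dispatch sketch. This is hypothetical: the package's actual internals may differ, and the `cupy` backend below is only an illustrative stand-in for a CUDA code path.

```python
import numpy as np

try:
    # Any CUDA backend would do here; cupy is an assumption for illustration
    import cupy as cp
    _cuda_available = True
except ImportError:
    _cuda_available = False

def dispatch_sum(data, force_cpu=False):
    """Route to the GPU when available, otherwise fall back to NumPy."""
    if _cuda_available and not force_cpu:
        return float(cp.asnumpy(cp.sum(cp.asarray(data))))
    return float(np.sum(data, dtype=np.float64))
```

Calling code never has to branch on hardware; the module-level flag decides once at import time.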

Quick Start

import numpy as np
from cuda_kernels import autocorrelation, reduction_sum

# Create test data
data = np.random.randn(1000).astype(np.float32)

# Compute autocorrelation (automatically uses CUDA if available)
acf = autocorrelation(data, max_lag=50)
print(f"Autocorrelation shape: {acf.shape}")

# Compute sum reduction
total = reduction_sum(data)
print(f"Sum: {total}")

API Reference

autocorrelation(data, max_lag=None, force_cpu=False)

Compute autocorrelation of a time series.

Parameters:

  • data (numpy.ndarray): Input 1D array (converted to float32)
  • max_lag (int, optional): Maximum lag to compute. Default: len(data)-1
  • force_cpu (bool): Force CPU implementation. Default: False

Returns:

  • numpy.ndarray: Autocorrelation values for lags [0, max_lag)
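For intuition, here is a plain-NumPy version of what the CPU fallback plausibly computes. This is a sketch, not the package's actual implementation; the normalization convention (dividing by the lag-0 term so that `acf[0] == 1.0`) is an assumption.

```python
import numpy as np

def autocorrelation_cpu(data, max_lag=None):
    """Normalized autocorrelation for lags [0, max_lag)."""
    data = np.asarray(data, dtype=np.float32)
    n = data.size
    if max_lag is None:
        max_lag = n - 1
    centered = data - data.mean()
    # Lag-0 term, used to normalize so acf[0] == 1.0
    variance = float(np.dot(centered, centered))
    acf = np.empty(max_lag, dtype=np.float32)
    for lag in range(max_lag):
        acf[lag] = np.dot(centered[:n - lag], centered[lag:]) / variance
    return acf
```

Each lag is a dot product of the series against a shifted copy of itself, which is exactly the kind of embarrassingly parallel work a CUDA kernel speeds up.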

reduction_sum(data, force_cpu=False)

Compute sum of array elements.

Parameters:

  • data (numpy.ndarray): Input 1D array (converted to float32)
  • force_cpu (bool): Force CPU implementation. Default: False

Returns:

  • float: Sum of all elements
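On the GPU, sums like this are typically computed as a tree (pairwise) reduction. The NumPy sketch below mirrors that shape purely for illustration; it assumes nothing about the package's actual kernel.

```python
import numpy as np

def pairwise_sum(data):
    """Sum by repeated pairwise halving, mirroring a CUDA tree reduction."""
    buf = np.asarray(data, dtype=np.float32).copy()
    while buf.size > 1:
        if buf.size % 2:
            # Odd length: pair up all but the last element, carry it forward
            buf = np.concatenate([buf[:-1:2] + buf[1::2], buf[-1:]])
        else:
            buf = buf[::2] + buf[1::2]
    return float(buf[0])

pairwise_sum(np.array([1, 2, 3, 4, 5], dtype=np.float32))  # 15.0
```

Besides parallelizing well, pairwise summation accumulates less float32 rounding error than a sequential loop on large arrays.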

Examples

Basic Usage

import numpy as np
from cuda_kernels import autocorrelation, reduction_sum

# Example 1: Autocorrelation
signal = np.sin(np.linspace(0, 4*np.pi, 1000)).astype(np.float32)
acf = autocorrelation(signal, max_lag=100)

# Example 2: Sum reduction
data = np.array([1, 2, 3, 4, 5], dtype=np.float32)
total = reduction_sum(data)  # Returns 15.0

Checking CUDA Status

import importlib

# _cuda_available is an internal flag on each submodule; import them
# explicitly rather than assuming they are already in sys.modules
autocorr_module = importlib.import_module('cuda_kernels.autocorrelation')
reduction_module = importlib.import_module('cuda_kernels.reduction')

print(f"CUDA available: {autocorr_module._cuda_available}")

Force CPU Mode

# Useful for testing or when you want consistent behavior
cpu_result = reduction_sum(data, force_cpu=True)

Performance

  • With CUDA: Significant speedup for large arrays (10K+ elements)
  • CPU Fallback: Optimized NumPy implementations, still efficient for most use cases
  • Automatic Detection: No configuration needed, works out of the box
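To check the speedup claim on your own hardware, a small timing harness helps. `np.sum` stands in below so the snippet runs anywhere; swap in `reduction_sum(data)` and `reduction_sum(data, force_cpu=True)` to compare the two paths.

```python
import time
import numpy as np

def bench(fn, *args, repeats=5):
    """Best wall-clock time over several runs (best-of reduces timer noise)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

data = np.random.randn(1_000_000).astype(np.float32)
print(f"np.sum baseline: {bench(np.sum, data) * 1e3:.3f} ms")
```

Remember that the first GPU call pays one-time kernel-launch and transfer overhead, so time a warm run.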

License

MIT License - see LICENSE file for details.
